Four-Dimensional Modulation With an Eight-State Trellis Code


By A. R. CALDERBANK and N. J. A. SLOANE* 
(Manuscript received January 9, 1985) 


A trellis code is a “sliding window” method for encoding a binary data
stream $\{a^i\}$, $a^i = 0, 1$, as a sequence of signal points drawn from $\mathbb{R}^n$. The rule
for assigning signal points depends on the state of the encoder. In this paper
$n = 4$, and the signal points are 4-tuples of odd integers. We describe an
infinite family of eight-state trellis codes. For $k = 3, 4, 5, \ldots$ we construct a
trellis encoder with a rate of k bits/four-dimensional signal. We propose that
the codes with rates k = 8 and 12 be considered for use in modems designed
to achieve data rates of 9.6 kb/s and 14.4 kb/s, respectively.


I. INTRODUCTION 


A trellis code is a “sliding window” method for encoding a binary
data stream $\{a^i\}$, $a^i = 0, 1$, as a sequence of signal points $\{x^i\}$ drawn
from $\mathbb{R}^n$. The set of possible signal points is finite, and this set is
called the signal constellation. The purpose of coding is to gain noise
immunity beyond that provided by standard uncoded transmission at
the same data rate. In this paper $n = 4$, and the signal points are
drawn from $(2\mathbb{Z} + 1)^4$, the lattice of 4-tuples of odd integers. We shall
regard transmission of a four-dimensional signal as one use of the 
channel, and we measure the rate of the code in bits per channel use. 
The four-dimensional signal space can be realized by using two space- 
orthogonal electric field polarizations to communicate on the same 
carrier frequency. It is also possible to regard each four-dimensional 
symbol as two consecutive two-dimensional symbols. 


* Authors are employees of AT&T Bell Laboratories. 
Ungerboeck¹ described a technique called set partitioning, which
assigns signal points to successive blocks of input data. The rule for 
assigning signal points depends on the state of the encoder. Unger- 
boeck constructed simple trellis codes providing the same noise im- 
munity as is given by increasing the power of uncoded transmission 
by factors ranging from 2 to 4 (coding gains ranging from 3 to 6 dB). 
Calderbank and Mazo² have given a different algebraic description of
trellis codes. Trellis codes with a rate of 4 bits/two-dimensional symbol 
have recently been proposed for use in modems designed to achieve 
data rates of 9.6 kb/s on dial-up voice telephone lines. These codes 
use the signal constellation shown in Fig. 1, which was originally 
described by Campopiano and Glazer,³ and gain 4 dB over uncoded
transmission at the same rate. In Section III of this paper we describe 
the first code in our infinite family. This code has a rate of 8 bits/ 
four-dimensional symbol and promises a gain of 4.7 dB over uncoded 
transmission. The signal constellation consists of 512 four-dimen- 
sional signal points. Transmission of two consecutive two-dimensional 
signals using one of the proposed trellis codes with a rate of 4 bits/ 
two-dimensional symbol requires $1024 = 32^2$ four-dimensional signal
points. Furthermore, the restriction of the 512-point constellation to 
the first two coordinates, or to the last two coordinates, is the 32-point 
constellation shown in Fig. 1. The 0.7-dB improvement in performance 
is derived from reducing the average transmitted power. 

In Section IV we briefly describe the second code in the family,
which has a rate of 12 bits/four-dimensional symbol and promises a
coding gain of 4.9 dB over uncoded transmission. We propose using
this code in modems designed to achieve data rates of 14.4 kb/s.

Fig. 1—Signal constellation for proposed trellis codes with rate 4 bits/two-dimensional symbol.

We begin by presenting a rate 3/4 binary convolutional code that is 
basic to the construction of the new trellis codes. In Section V we 
describe the general code in the family, with rate k bits/four-dimensional
symbol, for $k = 3, 4, 5, \ldots$, and we show that in the limit, as
$k \to \infty$, the coding gain is asymptotic to $10 \log_{10} \pi \approx 4.9715$ dB. The
difference between this limiting coding gain and that provided by the
code with $k = 12$ is very small.

After this paper was submitted, we discovered that Forney et al.⁴
had independently proposed a different rate k = 8 code with approximately
the same performance. Wilson, Sleeper, and Smith⁵ have
described simple trellis codes (with up to four encoder states) that use 
four-dimensional signal constellations. 


II. A RATE 3/4 BINARY CONVOLUTIONAL CODE


We assume that binary data are being encoded at a rate of k bits/
signal point and that the data enter the encoder in k parallel sequences,
$\{a_1^i\}, \{a_2^i\}, \ldots, \{a_k^i\}$. We assume that the output $x^i$ of the trellis
encoder at time i depends not only on the present values $a_1^i, a_2^i, \ldots, a_k^i$
of the input sequences, but also on the previous $\nu_j \ge 0$ bits of
the jth sequence. If $\nu_j = 0$ for some j, then $a_j^1, a_j^2, a_j^3, \ldots$ is said to be
a sequence of uncoded bits. The constraint length $\nu$ is given by
$\nu = \sum_{j=1}^{k} \nu_j$. The output $x^i$ of the encoder is a fixed vector-valued
function x of the $\nu + k$ binary variables $a_1^i, \ldots, a_1^{i-\nu_1}; a_2^i, \ldots, a_2^{i-\nu_2}; \ldots; a_k^i, \ldots, a_k^{i-\nu_k}$. That is,

$$x^i = x(a_1^i, a_1^{i-1}, \ldots, a_1^{i-\nu_1}; \ldots; a_k^i, \ldots, a_k^{i-\nu_k}).$$

The $\nu$-tuple $(a_1^{i-1}, \ldots, a_1^{i-\nu_1}; a_2^{i-1}, \ldots, a_2^{i-\nu_2}; \ldots; a_k^{i-1}, \ldots, a_k^{i-\nu_k})$ is the state of the
encoder and there are $2^{\nu}$ states. Figure 2 shows a state transition
diagram for a trellis code with $k = 3$, $\nu_1 = 0$, $\nu_2 = 1$, $\nu_3 = 2$. The average
transmitted signal power P is given by

$$P = \frac{1}{2^{\nu+k}} \sum \left\| x(a_1^i, \ldots, a_1^{i-\nu_1}; \ldots; a_k^i, \ldots, a_k^{i-\nu_k}) \right\|^2,$$

where the sum runs over all values of the $\nu + k$ binary variables.


Basic to the trellis codes constructed below is a certain rate 3/4 
binary convolutional code with total memory 3 and free distance 4. 
The encoder is presented in Fig. 3, which is taken from Ref. 6 (Fig. 
10.3, p. 292). The three parallel input sequences determine the output 
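sequence $\{v^i = (v_1^i, v_2^i, v_3^i, v_4^i)\}$ according to the following rules
(all sums are taken modulo 2):

$$\begin{aligned}
v_1^i &= a_1^i, \\
v_2^i &= a_1^i + a_2^i + a_2^{i-1} + a_3^{i-1}, \\
v_3^i &= a_1^i + a_2^{i-1} + a_3^i + a_3^{i-2}, \\
v_4^i &= a_1^i + a_2^i + a_3^i + a_3^{i-2}.
\end{aligned}$$

The triple $a_2^{i-1} a_3^{i-1} a_3^{i-2}$ is the state of the encoder. The possible
transitions between states are shown in Fig. 2. The edge joining state
$a_2^{i-1} a_3^{i-1} a_3^{i-2}$ to state $a_2^i a_3^i a_3^{i-1}$ is labeled with the outputs
$v^i = (v_1^i, v_2^i, v_3^i, v_4^i)$ and $\bar{v}^i = (\bar{v}_1^i, \bar{v}_2^i, \bar{v}_3^i, \bar{v}_4^i)$ corresponding to this
transition. Note that

$$\begin{aligned}
\bar{v}_1^i &= \bar{a}_1^i, \\
\bar{v}_2^i &= \bar{a}_1^i + a_2^i + a_2^{i-1} + a_3^{i-1}, \\
\bar{v}_3^i &= \bar{a}_1^i + a_2^{i-1} + a_3^i + a_3^{i-2}, \\
\bar{v}_4^i &= \bar{a}_1^i + a_2^i + a_3^i + a_3^{i-2},
\end{aligned}$$

where $\bar{a}_1^i = 1 + a_1^i$ is the complement of $a_1^i$.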


Fig. 2—A state transition diagram for a trellis code with $k = 3$, $\nu_1 = 0$, $\nu_2 = 1$, $\nu_3 = 2$.
Encoder states are labeled $a_2^{i-1}, a_3^{i-1}, a_3^{i-2}$, and time runs from $i$ to $i + 1$.
Example edge labels: x(0; 01; 010) and x(1; 01; 010); x(1; 11; 111) and x(0; 11; 111).
(Every edge represents two possible transitions.)

Fig. 3—A rate 3/4 binary convolutional code.


We change from 0, 1 notation to $\pm 1$ notation ($0 \to +1$ and $1 \to -1$).
An edge joining two states is now labeled with pairs of vectors $\pm(w_1,
w_2, w_3, w_4)$, where $w_i = \pm 1$, $i = 1, 2, 3, 4$. This defines a trellis encoder
with a rate of 3 bits/four-dimensional symbol. The minimum squared 
distance of this trellis code is simply four times the free distance of 
the original binary convolutional code, namely, 16. This is because 0 
opposite 1 contributes 1 to the free distance, whereas 1 opposite —1 
contributes 4 to the squared minimum distance. 
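
The encoder and the distance argument above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the tap equations are those read from the rules in this section (the scanned equations are partly illegible, so the taps for the second output in particular are an assumption), and the function names `encode`, `to_signal`, and `min_weight` are hypothetical, not taken from the paper.

```python
from itertools import product

def encode(inputs):
    """Encode a list of (a1, a2, a3) bit triples into 4-bit output blocks.

    Assumed tap equations (mod 2), as read from this section:
      v1 = a1
      v2 = a1 + a2 + a2[i-1] + a3[i-1]
      v3 = a1 + a2[i-1] + a3 + a3[i-2]
      v4 = a1 + a2 + a3 + a3[i-2]
    """
    a2_prev = a3_prev = a3_prev2 = 0  # encoder state, initially all zero
    blocks = []
    for a1, a2, a3 in inputs:
        v1 = a1
        v2 = (a1 + a2 + a2_prev + a3_prev) % 2
        v3 = (a1 + a2_prev + a3 + a3_prev2) % 2
        v4 = (a1 + a2 + a3 + a3_prev2) % 2
        blocks.append((v1, v2, v3, v4))
        a2_prev, a3_prev, a3_prev2 = a2, a3, a3_prev
    return blocks

def to_signal(block):
    """Map bits to signal coordinates: 0 -> +1, 1 -> -1."""
    return tuple(1 - 2 * b for b in block)

def min_weight(max_len=3):
    """Smallest Hamming weight over all nonzero inputs of up to max_len steps.

    Two all-zero steps are appended so the encoder returns to the zero state.
    """
    triples = list(product((0, 1), repeat=3))
    best = None
    for n in range(1, max_len + 1):
        for seq in product(triples, repeat=n):
            if not any(any(t) for t in seq):
                continue  # skip the all-zero input
            code = encode(list(seq) + [(0, 0, 0)] * 2)
            weight = sum(sum(block) for block in code)
            best = weight if best is None else min(best, weight)
    return best

def squared_distance_from_zero_path(inputs):
    """Squared Euclidean distance between the +-1 image of a codeword
    and the +-1 image of the all-zero codeword."""
    total = 0
    for block in encode(inputs):
        total += sum((s - 1) ** 2 for s in to_signal(block))
    return total
```

Under these assumed taps, `min_weight()` evaluates to 4, and a single uncoded bit `[(1, 0, 0)]` yields the block (1, 1, 1, 1), whose image lies at squared distance 16 from the all-zero path, in line with the factor of four described above.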

Transmission at the higher rates of 8 and 12 bits/four-dimensional 
symbol requires more channel symbols. Indeed, to achieve any coding 
gain, we have to use more symbols than are required by uncoded 
transmission at the same rate. 


III. A TRELLIS CODE WITH RATE 8 BITS/FOUR-DIMENSIONAL
SYMBOL


Uncoded transmission at the rate 4 bits/two-dimensional symbol 
uses the rectangular signal constellation shown in Fig. 4. To achieve 
uncoded transmission of a four-dimensional symbol at a rate of 8 bits/ 
symbol, simply take two copies of this scheme. There are 256 possible 
signals and the average power is $4(1^2 + 3^2)/2 = 20$. Since the minimum
squared distance between distinct signals is 4, we have

$$\left(\frac{d^2_{\min}}{P}\right)_{\text{uncoded}} = \frac{4}{20}.$$


For coded transmission we shall use $2 \times 256 = 512$ signal points.
Representative signal points are listed in Table I. The remaining
points are obtained from these representatives by permuting the
coordinates and changing signs in all possible ways. For example,
$(3131)$, $(1\bar{3}15)$, and $(5111)$ are all signal points (where $\bar{x}$ denotes
$-x$). For every vector $w = (w_1, w_2, w_3, w_4)$ with $w_i = \pm 1$, $i = 1, 2, 3, 4$,
let $S(w)$ be the set of 32 signal points $(x_1, x_2, x_3, x_4)$ satisfying $x_i \equiv w_i$
(mod 4), for $i = 1, 2, 3, 4$. The sets $S(w)$ partition the signal
constellation into 16 equal parts. The set $S(1111)$ is shown in Table
II and the other sets are obtained from $S(1111)$ by changing signs. For
example, $S(1\bar{1}1\bar{1})$ is obtained from $S(1111)$ by changing the signs of
the second and fourth entries. The distance $d(A, B)$ between two sets
of vectors A and B is given by

$$d(A, B) = \min_{x \in A,\ y \in B} \|x - y\|.$$

Fig. 4—The rectangular constellation for uncoded transmission at 4 bits/two-dimensional symbol.

Table I—The signal constellation for coded transmission at 8 bits/four-dimensional
symbol. All permutations of coordinates and all sign changes are allowed.

Representative   Energy   (1/16) x Number
(1111)              4        1
(3111)             12        4
(3311)             20        6
(5111)             28        4
(3331)             28        4
(5311)             36       12
(3333)             36        1
The partition into sets $S(w)$ satisfies the following metric properties:

(M1) if $x, y \in S(w)$ and $x \ne y$ then $\|x - y\|^2 \ge 16$,

(M2) if $v \ne w$ then $d^2(S(v), S(w)) = \|v - w\|^2$.

To verify (M1) let $x = (x_1, x_2, x_3, x_4)$ and $y = (y_1, y_2, y_3, y_4)$ be distinct
points of $S(w)$. Then $x_i \ne y_i$ for some i. Since $x_i \equiv y_i$ (mod 4), we have
$|x_i - y_i| \ge 4$ and so $\|x - y\|^2 \ge 16$. To verify (M2) let
$x = (x_1, x_2, x_3, x_4) \in S(v)$ and $y = (y_1, y_2, y_3, y_4) \in S(w)$.
If $x_i \not\equiv y_i$ (mod 4) then $|x_i - y_i|^2 \ge 4$. Hence $\|x - y\|^2 \ge
\|v - w\|^2$ and equality holds when $x = v$ and $y = w$.

Table II—The set $S(1111)$. All permutations of coordinates are allowed.

Signal Point        Energy   Number
(1, 1, 1, 1)           4        1
(-3, 1, 1, 1)         12        4
(-3, -3, 1, 1)        20        6
(5, 1, 1, 1)          28        4
(-3, -3, -3, 1)       28        4
(5, -3, 1, 1)         36       12
(-3, -3, -3, -3)      36        1
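
Properties (M1) and (M2) can also be checked by brute force on the 512-point constellation. The sketch below assumes that the constellation is exactly the set of 4-tuples of odd integers with energy at most 36 (which is what Table I lists); the helper names `coset_label`, `partition`, and `metric_properties_hold` are hypothetical.

```python
from itertools import product

def constellation_512():
    """All 4-tuples of odd integers with energy at most 36."""
    odd = (-5, -3, -1, 1, 3, 5)
    return [x for x in product(odd, repeat=4) if sum(c * c for c in x) <= 36]

def coset_label(x):
    """Return w with w_i = +1 if x_i = 1 (mod 4), and w_i = -1 otherwise."""
    return tuple(1 if c % 4 == 1 else -1 for c in x)

def partition(points):
    """Group the points into the sets S(w), keyed by w."""
    sets = {}
    for x in points:
        sets.setdefault(coset_label(x), []).append(x)
    return sets

def sqdist(x, y):
    return sum((a - b) ** 2 for a, b in zip(x, y))

def metric_properties_hold(sets):
    """Check (M1) and (M2) exhaustively over all pairs of points."""
    for w, members in sets.items():
        # (M1): distinct points of one set are at squared distance >= 16
        if any(sqdist(x, y) < 16 for x in members for y in members if x != y):
            return False
    for v in sets:
        for w in sets:
            if v == w:
                continue
            # (M2): the squared set distance equals ||v - w||^2
            d2 = min(sqdist(x, y) for x in sets[v] for y in sets[w])
            if d2 != sqdist(v, w):
                return False
    return True
```

With these definitions, `partition(constellation_512())` has 16 keys with 32 points each, and `metric_properties_hold` returns `True` for it.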

In Section II we described a trellis code with a rate of 3 bits/four-dimensional
symbol and minimum squared distance $d^2_{\min} = 16$. To
achieve the higher transmission rate of 8 bits/four-dimensional symbol,
we add 5 uncoded bits. There are now eight parallel input sequences
$\{a_1^i\}, \ldots, \{a_8^i\}$. The sequences $\{a_2^i\}$, $\{a_3^i\}$ determine the state
$a_2^{i-1} a_3^{i-1} a_3^{i-2}$ of the encoder as in Fig. 3. An edge joining two states that
was originally labeled by the pair of vectors $\pm v$ is now labeled by the
64 vectors in $S(v) \cup S(-v)$. This is because there are 64 parallel
transitions between states $a_2^{i-1} a_3^{i-1} a_3^{i-2}$ and $a_2^i a_3^i a_3^{i-1}$ corresponding to
the 64 possible inputs $a_1^i a_4^i \cdots a_8^i$. We allow any fixed assignment of
channel symbols in $S(v) \cup S(-v)$ to inputs $a_1^i a_4^i \cdots a_8^i$.

Consider the distance properties of the high-rate code. Properties
(M1) and (M2) guarantee that the squared distance of any error event
of length 1 is at least 16. Consider any error event in the eight-state
trellis of length $\ell$ greater than 1. If the squared distance for the low-rate
code is

$$\sum_{i=1}^{\ell} \|v^i - w^i\|^2,$$

then the squared distance for the high-rate code is at least

$$\sum_{i=1}^{\ell} d^2(S(v^i), S(w^i)).$$

Property (M2) now implies that the minimum squared distance of the
high-rate code is at least 16.
The average signal power P of the 512-point signal constellation is
given by

$$P = \frac{16(4 \times 1 + 12 \times 4 + 20 \times 6 + 28 \times 8 + 36 \times 13)}{512} = 27.$$

Thus,

$$\left(\frac{d^2_{\min}}{P}\right)_{\text{coded}} = \frac{16}{27},$$

and the coding gain (in decibels) is

$$10 \log_{10} \left[ \frac{(d^2_{\min}/P)_{\text{coded}}}{(d^2_{\min}/P)_{\text{uncoded}}} \right] = 10 \log_{10} \left( \frac{16/27}{4/20} \right) = 4.717 \text{ dB}.$$
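
The arithmetic of this section can be reproduced directly from the two constellations. The following is a minimal sketch, assuming the uncoded constellation is the set of 4-tuples with entries in {±1, ±3} and the coded constellation is the set of odd 4-tuples with energy at most 36; the function names are hypothetical.

```python
import math
from itertools import product

def average_power(points):
    """Mean squared norm of a list of signal points."""
    return sum(sum(c * c for c in x) for x in points) / len(points)

def coding_gain_db(d2_coded, p_coded, d2_uncoded, p_uncoded):
    """10 log10 of the ratio of d^2_min / P for coded and uncoded schemes."""
    return 10 * math.log10((d2_coded / p_coded) / (d2_uncoded / p_uncoded))

# Uncoded: two copies of the 16-point rectangular constellation (256 points).
uncoded = list(product((-3, -1, 1, 3), repeat=4))

# Coded: 512 odd-integer 4-tuples of energy at most 36.
coded = [x for x in product((-5, -3, -1, 1, 3, 5), repeat=4)
         if sum(c * c for c in x) <= 36]

p_uncoded = average_power(uncoded)   # 20.0
p_coded = average_power(coded)       # 27.0
gain = coding_gain_db(16, p_coded, 4, p_uncoded)
```

Here `gain` comes out at about 4.717 dB, matching the figure quoted above.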


IV. A TRELLIS CODE WITH RATE 12 BITS/FOUR-DIMENSIONAL 
SYMBOL 


Uncoded transmission at the rate of 6 bits/two-dimensional symbol 
uses the 64-point rectangular constellation shown in Fig. 5. To achieve 
uncoded transmission of a four-dimensional symbol at a rate of 12 
bits/symbol, simply take two copies of this scheme. There are $64^2 =
2^{12}$ possible signals, and the average signal power P is
$4(1^2 + 3^2 + 5^2 + 7^2)/4 = 84$. Thus,

$$\left(\frac{d^2_{\min}}{P}\right)_{\text{uncoded}} = \frac{4}{84}.$$





Fig. 5—A rectangular constellation for uncoded transmission at 6 bits/two-dimensional symbol.

Table III—The signal constellation for coded transmission at 12 bits/
four-dimensional symbol. All representatives are taken from $S(1111)$.

Energy   (1/16) x Number          Representatives
   4     1                        (1, 1, 1, 1)
  12     4                        (-3, 1, 1, 1)
  20     6                        (-3, -3, 1, 1)
  28     4 + 4 = 8                (5, 1, 1, 1), (-3, -3, -3, 1)
  36     12 + 1 = 13              (5, -3, 1, 1), (-3, -3, -3, -3)
  44     12                       (5, -3, -3, 1)
  52     4 + 4 + 6 = 14           (-7, 1, 1, 1), (5, -3, -3, -3), (5, 5, 1, 1)
  60     12 + 12 = 24             (-7, -3, 1, 1), (5, 5, -3, 1)
  68     12 + 6 = 18              (-7, -3, -3, 1), (5, 5, -3, -3)
  76     4 + 4 + 12 = 20          (-7, -3, -3, -3), (5, 5, 5, 1), (-7, 5, 1, 1)
  84     4 + 24 + 4 = 32          (9, 1, 1, 1), (-7, 5, -3, 1), (5, 5, 5, -3)
  92     12 + 12 = 24             (9, -3, 1, 1), (-7, 5, -3, -3)
 100     12 + 6 + 12 + 1 = 31     (9, -3, -3, 1), (-7, -7, 1, 1), (-7, 5, 5, 1), (5, 5, 5, 5)
 108     12 + 4 + 12 + 12 = 40    (-7, 5, 5, -3), (9, -3, -3, -3), (9, 5, 1, 1), (-7, -7, -3, 1)
 116     24 + 6 = 30              (9, 5, -3, 1), (-7, -7, -3, -3)
 124     12 + 4 + 12 + 4 = 32     (9, 5, -3, -3), (-11, 1, 1, 1), (-7, -7, 5, 1), (-7, 5, 5, 5)
 132     12 + 12 + 12 + 12 = 48   (-11, -3, 1, 1), (9, -7, 1, 1), (9, 5, 5, 1), (-7, -7, 5, -3)
 140     12 + 24 + 12 = 48        (-11, -3, -3, 1), (9, -7, -3, 1), (9, 5, 5, -3)
 148     4 + 12 + 12 + 6 + 4 = 38 (-11, -3, -3, -3), (-11, 5, 1, 1), (9, -7, -3, -3), (-7, -7, 5, 5), (-7, -7, -7, 1)
 156     24 + 4 + 24 + 4 = 56     (-11, 5, -3, 1), (9, 5, 5, 5), (9, -7, 5, 1), (-7, -7, -7, -3)
 164     only 13                  (-11, 5, -3, -3), (9, 9, 1, 1), (9, -7, 5, -3)

For coded transmission we use $2 \times 2^{12} = 2^{13}$ signal points. As in
Section III we partition the signal constellation into 16 sets S(w) 
according to congruence of the entries modulo 4. Each set S(w) 
contains 512 signal points. Representative signal points are listed in 
Table III, where the representatives are all taken from S(1111). 

To achieve the transmission rate of 12 bits/four-dimensional sym- 
bol, we add 9 uncoded bits to the low-rate trellis code described in 
Section II. There are now 1024 parallel transitions between states
$a_2^{i-1} a_3^{i-1} a_3^{i-2}$ and $a_2^i a_3^i a_3^{i-1}$ in the eight-state trellis. If the edge
corresponding to this transition was originally labeled $\pm v$, it is now labeled
with the 1024 vectors in $S(v) \cup S(-v)$. The metric properties (M1)
and (M2) guarantee that the squared minimum distance of the high-rate
code is equal to the squared minimum distance of the low-rate
code, which is 16. An easy calculation shows that the average signal
power P is 108.625, so

$$\left(\frac{d^2_{\min}}{P}\right)_{\text{coded}} = \frac{16}{108.625}.$$

The coding gain is

$$10 \log_{10} \left( \frac{16/108.625}{4/84} \right) = 4.904 \text{ dB}.$$
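
The "easy calculation" can be sketched by filling energy shells in order. The code below assumes that the shell of energy 4m (m odd) holds 16σ(m) points, which is consistent with the counts in Table III, and that the last shell is only partly used; `sigma` and `shell_filled_power` are hypothetical helper names.

```python
import math

def sigma(m):
    """Sum of the positive divisors of m."""
    return sum(d for d in range(1, m + 1) if m % d == 0)

def shell_filled_power(num_points):
    """Average energy when shells of energy 4, 12, 20, ... are filled in order.

    Assumes the shell of energy 4m (m odd) contains 16 * sigma(m) points,
    and takes only as many points from the final shell as are needed.
    """
    remaining = num_points
    total_energy = 0
    m = 1
    while remaining > 0:
        take = min(remaining, 16 * sigma(m))
        total_energy += 4 * m * take
        remaining -= take
        m += 2
    return total_energy / num_points

p_coded_12 = shell_filled_power(2 ** 13)   # 108.625
p_uncoded_12 = 4 * (1 + 9 + 25 + 49) / 4   # 84.0
gain_12 = 10 * math.log10((16 / p_coded_12) / (4 / p_uncoded_12))
```

The same helper gives 27.0 for `shell_filled_power(512)`, the rate-8 value from Section III, and `gain_12` evaluates to about 4.904 dB.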
V. ASYMPTOTIC PERFORMANCE OF A FAMILY OF CODES


To achieve coded transmission at the rate of k bits/four-dimensional
signal, we add $k - 3$ uncoded bits to the low-rate trellis code described
in Section II. There are $2^{k-2}$ parallel transitions between states
$a_2^{i-1} a_3^{i-1} a_3^{i-2}$ and $a_2^i a_3^i a_3^{i-1}$ in the eight-state trellis. Coded transmission
requires $2^{k+1}$ signal points. The points of the lattice $(2\mathbb{Z} + 1)^4$ lie in
shells around the origin consisting of 16 vectors of energy 4, 64 vectors
of energy 12, and so on (see Table III). The $2^{k+1}$ signal points are
obtained by taking all points of energy 4, 12, 20, $\ldots$ and just enough
points of a final shell to bring the total number up to $2^{k+1}$. The signal
constellation is partitioned into 16 sets $S(v)$ according to congruence
of the entries modulo 4. Each set contains $2^{k-3}$ signal points. Edges in
the eight-state trellis originally labeled $\pm v$ are now labeled with the
$2^{k-2}$ vectors in $S(v) \cup S(-v)$. The metric properties (M1) and (M2)
guarantee that the minimum squared distance of this trellis code is
16.

Consider the asymptotic performance of this family of codes. For 
simplicity suppose that the signal constellation of each code in the 
family is a complete union of energy shells. If x is a vector in the
lattice $(2\mathbb{Z} + 1)^4$, then $\|x\|^2 \equiv 4$ (mod 8), since $\|x\|^2$ is the sum of four
odd squares. A classical result, due to Jacobi and to Legendre, is that
every positive integer of the form $8n + 4$ is a sum of four odd squares
in $\sigma(2n + 1)$ ways, where $\sigma(m)$ is the sum of divisors of m. The
generating function

$$16 \sum_{\substack{m \ge 1 \\ m\ \text{odd}}} \sigma(m)\, q^{4m}$$

expresses the fact that there are $16\sigma(m)$ vectors of energy 4m in the
lattice $(2\mathbb{Z} + 1)^4$. The factor of 16 arises from the 16 possible sign
changes.
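
The shell sizes can be checked by direct enumeration for small energies. The sketch below simply counts 4-tuples of odd integers with a given squared norm and compares the count with 16σ(m); the names `sigma` and `count_odd_vectors` are hypothetical.

```python
import math
from itertools import product

def sigma(m):
    """Sum of the positive divisors of m."""
    return sum(d for d in range(1, m + 1) if m % d == 0)

def count_odd_vectors(energy):
    """Number of 4-tuples of odd integers whose squared norm equals energy."""
    r = math.isqrt(energy)  # no coordinate can exceed this in absolute value
    odd = [c for c in range(-r, r + 1) if c % 2 != 0]
    return sum(1 for x in product(odd, repeat=4)
               if sum(c * c for c in x) == energy)

# Compare the brute-force count with 16 * sigma(m) for the first few odd m.
shell_counts = {m: count_odd_vectors(4 * m) for m in range(1, 22, 2)}
```

For example, `shell_counts[1]` is 16 and `shell_counts[3]` is 64, the first two shells mentioned in the text, while an energy such as 8, which is not congruent to 4 modulo 8, gives a count of 0.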

The following estimates (which are proved in the Appendix) will be
used to calculate the average energy of the vectors x in $(2\mathbb{Z} + 1)^4$ with
$\|x\|^2 \le 4n$:

$$\sum_{\substack{1 \le m \le n \\ m\ \text{odd}}} \sigma(m) = \frac{\pi^2 n^2}{32} + O(n \log n), \tag{1}$$

$$\sum_{\substack{1 \le m \le n \\ m\ \text{odd}}} m\,\sigma(m) = \frac{\pi^2 n^3}{48} + O(n^2 \log n). \tag{2}$$

The average energy P of the vectors x in $(2\mathbb{Z} + 1)^4$ with $\|x\|^2 \le 4n$ is
given by

$$P = \frac{4\left(\pi^2 n^3/48 + O(n^2 \log n)\right)}{\pi^2 n^2/32 + O(n \log n)} = \frac{8}{3}\, n + O(\log n).$$


Therefore,

$$\left(\frac{d^2_{\min}}{P}\right)_{\text{coded}} = \frac{16}{(8/3)n + O(\log n)} = \frac{6}{n} + \frac{O(\log n)}{n^2}. \tag{3}$$

The number of points in the signal constellation is

$$16 \sum_{\substack{1 \le m \le n \\ m\ \text{odd}}} \sigma(m) = \frac{\pi^2 n^2}{2} + O(n \log n).$$

For uncoded transmission we use just half this many points. The signal
constellation is the set of all 4-tuples $x = (x_1, x_2, x_3, x_4)$, where $x_i =
\pm 1, \pm 3, \ldots, \pm(2a - 1)$. The number of points in this constellation is
$16a^4$, so we choose a to make $a^4$ close to $\pi^2 n^2/64$. Now

$$1^2 + 3^2 + 5^2 + \cdots + (2a - 1)^2 = \frac{a(4a^2 - 1)}{3},$$

so the average signal power P is given by

$$P = \frac{4a(4a^2 - 1)}{3a} = \frac{4}{3}(4a^2 - 1).$$

Since the minimum squared distance between distinct signals is 4, we
have

$$\left(\frac{d^2_{\min}}{P}\right)_{\text{uncoded}} = \frac{4}{4(4a^2 - 1)/3} = \frac{6}{8a^2 - 2}.$$

Now $8a^2 - 2 = \pi n + O(\sqrt{n})$, so

$$\left(\frac{d^2_{\min}}{P}\right)_{\text{uncoded}} = \frac{6}{\pi n} + O(n^{-3/2}).$$

Therefore,

$$\lim_{n \to \infty} \frac{(d^2_{\min}/P)_{\text{coded}}}{(d^2_{\min}/P)_{\text{uncoded}}} = \pi,$$

and the limiting coding gain is $10 \log_{10} \pi = 4.9714 \cdots$ dB.
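
The approach to the limit can be illustrated numerically. The following sketch evaluates the coding gain for a constellation made of all shells of energy at most 4n. It makes two simplifying assumptions beyond the text: shell sizes are taken as 16σ(m), and the uncoded parameter a is treated as a real number satisfying $16a^4$ = half the number of coded points, rather than being rounded to an integer. The function names are hypothetical.

```python
import math

def odd_sigma_sums(n):
    """Return (sum of sigma(m), sum of m * sigma(m)) over odd m <= n."""
    sig = [0] * (n + 1)
    for d in range(1, n + 1, 2):
        # odd multiples of an odd d are d, 3d, 5d, ...
        for m in range(d, n + 1, 2 * d):
            sig[m] += d
    s0 = sum(sig[m] for m in range(1, n + 1, 2))
    s1 = sum(m * sig[m] for m in range(1, n + 1, 2))
    return s0, s1

def shell_coding_gain_db(n):
    """Coding gain (dB) for the union of all shells of energy <= 4n."""
    s0, s1 = odd_sigma_sums(n)
    p_coded = 4 * s1 / s0                 # average energy of 16 * s0 points
    a_squared = math.sqrt(s0 / 2)         # from 16 a^4 = (16 * s0) / 2
    p_uncoded = 4 * (4 * a_squared - 1) / 3
    return 10 * math.log10((16 / p_coded) / (4 / p_uncoded))

limit_db = 10 * math.log10(math.pi)
```

For n = 9 the shells are exactly the 512-point constellation of Section III, and `shell_coding_gain_db(9)` reproduces the gain of about 4.717 dB found there. For large n the value approaches `limit_db`.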
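APPENDIX
Proof of Equations (1) and (2)


To prove that

$$\sum_{\substack{1 \le m \le n \\ m\ \text{odd}}} \sigma(m) = \frac{\pi^2 n^2}{32} + O(n \log n),$$

we write

$$\begin{aligned}
\sum_{\substack{1 \le m \le n \\ m\ \text{odd}}} \sigma(m)
&= \sum_{\substack{1 \le m \le n \\ m\ \text{odd}}} \ \sum_{\substack{q \mid m \\ q\ \text{odd}}} q \\
&= \sum_{\substack{d \le n \\ d\ \text{odd}}} \ \sum_{\substack{q \le n/d \\ q\ \text{odd}}} q \\
&= \sum_{\substack{d \le n \\ d\ \text{odd}}} \tfrac{1}{4} \left[ (n/d)_o + 1 \right]^2,
\end{aligned}$$

where $(n/d)_o$ is the largest odd integer $\le n/d$. Then $(n/d)_o + 1 =
n/d + \mu$, where $-1 < \mu \le 1$, and

$$\begin{aligned}
\sum_{\substack{1 \le m \le n \\ m\ \text{odd}}} \sigma(m)
&= \frac{n^2}{4} \Biggl( \sum_{\substack{d \le n \\ d\ \text{odd}}} \frac{1}{d^2} \Biggr) + O\Biggl( n \sum_{\substack{d \le n \\ d\ \text{odd}}} \frac{1}{d} \Biggr) \\
&= \frac{n^2}{4} \Biggl( \sum_{\substack{d \le n \\ d\ \text{odd}}} \frac{1}{d^2} \Biggr) + O(n \log n) \\
&= \frac{n^2}{4} \Biggl( \sum_{d \le n} \frac{1}{d^2} - \sum_{\substack{d \le n \\ d\ \text{even}}} \frac{1}{d^2} \Biggr) + O(n \log n) \\
&= \frac{n^2}{4} \Biggl( \sum_{d \le n} \frac{1}{d^2} - \frac{1}{4} \sum_{d \le n/2} \frac{1}{d^2} \Biggr) + O(n \log n) \\
&= \frac{n^2}{4} \left[ \left( \frac{\pi^2}{6} + O\!\left(\frac{1}{n}\right) \right) - \frac{1}{4} \left( \frac{\pi^2}{6} + O\!\left(\frac{1}{n}\right) \right) \right] + O(n \log n) \\
&= \frac{\pi^2 n^2}{32} + O(n \log n).
\end{aligned}$$

The estimates for partial sums are obtained using Euler’s summation
formula (see Ref. 7, p. 54). To prove

$$\sum_{\substack{1 \le m \le n \\ m\ \text{odd}}} m\,\sigma(m) = \frac{\pi^2 n^3}{48} + O(n^2 \log n),$$

we write

$$\sum_{\substack{1 \le m \le n \\ m\ \text{odd}}} m\,\sigma(m) = \sum_{\substack{d \le n \\ d\ \text{odd}}} d \sum_{\substack{qd \le n \\ q\ \text{odd}}} q^2$$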
100-GHz Measurements of Two Astigmatic Launchers


By R. A. SEMPLAK* 
(Manuscript received October 3, 1984) 


Astigmatic launchers that would permit a single earth station antenna to 
communicate with all the satellites along the geosynchronous arc have been 
fabricated and measured at a frequency of 100 GHz. Good agreement between 
measured data and calculated values has been obtained for astigmatic correc- 
tions required by feeds displaced 18 and 29 degrees from the focus. 


I. INTRODUCTION


For high-capacity satellite communication systems, communication 
satellites are placed at different locations along the geosynchronous 
arc with the usual practice of using a separate earth station antenna 
to communicate with each satellite in the system. If both the satellites 
and earth stations are equipped with multiple-beam antennas, these 
high-capacity communication systems could be achieved by using a 
single earth station antenna and simultaneously communicating with 
all the satellites in the system.’ 

Measurements and theory have indicated that the geometry of an 
offset Cassegrainian antenna results in an ideal configuration?” for 
both earth station and satellite antennas. Since the antenna aperture 
has no blockage, this significantly reduces the sidelobe levels and, in 
turn, reduces interference. However, since only one of the multiple
beams can be aimed along the axis of the antenna reflector, the
remaining beams must be displaced from the focus. The loss in
efficiency that these displaced beams exhibit is a function of the
amount of astigmatism introduced as a result of the displacement
from the focus. By using a feed with different phase centers in the two
principal planes of its beam, shown in Fig. 1, one can eliminate the
astigmatic loss.® For efficient operation over a wide band of frequencies,
both the two phase centers (F, F′) and the beamwidths in the
two principal planes (θ, θ′) must be frequency independent.

Fig. 1—Astigmatic correction can be obtained by a feed with two different phase
centers, F and F′, in the two principal planes of its beam. (Labeled parts: aperture, main reflector.)

Earlier work by Dragone’ and Chu® shows that frequency-independ- 
ent astigmatic corrections can be obtained by combining a small horn 
with two cylindrical reflectors whose focal lengths are such that a 
magnified image of the feed horn is produced over the main reflector 
aperture.” However, this feed arrangement is not very suitable for an 
earth station antenna supporting multiple beams, since the distance 
between the two phase centers is fixed and cannot be varied after the 
feed is constructed. If one were to vary this distance, the beamwidths 
in the two principal planes would change, causing a reduction in 
aperture efficiency. This is an important restriction, for it implies that 
a given feed can only be used at certain locations in the vicinity of the 
focus; at other locations corresponding to other beam displacements, 
different feed parameters are required, necessitating the design of 
different feeds for different displacements. In addition, a large feed 
aperture is required, along with relatively large dimensions for one of 
the two reflectors. 


II. DISCUSSION


Reference 6 describes a single launcher design that overcomes the 
above difficulties and permits the phase center separation to vary 
while maintaining constant beamwidth in the two principal planes.
Using the principles of Ref. 6, two launchers (one long and one short) 
were designed and fabricated for operation at 100 GHz. 

The electroformed feed horn used with the launchers is shown in 
Fig. 2. To permit polarization rotation in the rectangular aperture of 
the feed horn, the horn was fabricated in two sections—one section 
tapering down to a square aperture and the second section tapering 
down to rectangular waveguide. The complete feed horn can be seen 
mounted together with the mixer on the short astigmatic launcher 
shown on the right of Fig. 3. 

The long astigmatic launcher, shown without feed horn on the left 
of Fig. 3, has the top parallel plate removed to display the first reflector 
that would be illuminated by the feed horn. Both the short and long 
launchers shown here have identical pairs of reflectors; the only 
difference is the length of the parallel plates. 

The cylindrical wave radiated by the feed horn positioned at the 
focus of the first reflector is guided to the first reflector by the parallel 
plates. After being reflected, the wave is again guided by the parallel 
plates in the direction of the second reflector. After some distance the 
parallel plates are truncated and the aperture illuminated by the 
reflected cylindrical wave is defined by this truncation. The width of
the aperture is defined by the spacing between the two parallel plates;
the wave radiated by this aperture illuminates the second cylindrical
reflector.

Fig. 2—100-GHz feed horn used with the astigmatic launchers.

Fig. 3—The short astigmatic launcher (right) and the long astigmatic launcher (left).

To produce an image of the feed horn aperture over the aperture of 
the main reflector, the distances of the phase centers of the feed horn 
and the truncated parallel plate aperture must satisfy the optical thin 
lens equation.® 


III. ASTIGMATIC LAUNCHER MEASUREMENTS


Using the newly constructed anechoic chamber at the radio range 
facilities at Holmdel, New Jersey, measurements were made of the 
radiation characteristics of the two 100-GHz astigmatic launchers 
depicted in Fig. 3. The measured data,* both amplitude and phase, are 
presented in Figs. 4 through 7 for both launchers and are shown by 
the solid curves. The dashed curves are the calculated theoretical 
values. 


* These data were obtained at a distance equivalent to that of the main reflector of 
an offset Cassegrainian antenna, i.e., the data represent the actual illumination at the
aperture of the main reflector. 
Fig. 4—(a) Measurements for an electric field parallel to the plates of the short
launcher, and the theoretical amplitude distribution. The phase center lies in front of 
the aperture. (Cont.) 


To avoid any difficulty in visualizing the polarization of the electric 
field, the electric field will always be referred to the plane of the 
parallel plates. Therefore, the electric field will be either parallel to or 
orthogonal to the plates. Further, the insert in each of these figures 
shows the position of the launcher with respect to the plane of 
measurements. The location of the phase center is also shown on the 
insert. 

The amplitude measurements shown by the solid curve of Fig. 4a
were obtained with the electric field parallel to the plates. The agreement
with the theoretical calculations is very good. An examination
of the phase measurements shows a maximum phase variation of the
order of 6 degrees. Over much of the aperture, the phase is essentially
constant.

Fig. 4—(b) The electric field remains parallel to the plates, but the launcher is rotated
90 degrees. The phase center lies behind the aperture.

For the amplitude data shown by the solid curve of Fig. 4b, the
electric field is still parallel to the plates. As shown by the insert on
this figure, the launcher is rotated 90 degrees. The dashed curve is
that for a uniformly illuminated aperture. The measured data, given
by the solid curve, are in good agreement with it. From the phase data shown
here, one sees that the phase change over the aperture is of the order
of 8 degrees. However, over most of the aperture the phase is essentially
constant.

Fig. 5—(a) Measurements for an electric field orthogonal to the plates of the short
launcher, and the theoretical amplitude distribution. The phase center lies behind the
aperture. (Cont.)

Measurements made with the electric field orthogonal to the plates 
are shown in Figs. 5a and b. As shown here by the dashed curves, the 
agreement between the calculated values and the measured data is 
very good. From the phase measurements shown on these figures, one 
can see that the phase is essentially constant across the aperture. The 
two pairs of measured data shown by Figs. 4 and 5 are essentially 
identical. 
Fig. 5—(b) The electric field remains orthogonal to the plates, but the launcher is
rotated 90 degrees. The phase center lies in front of the aperture. 


The feed horn and mixer assembly were then transferred to the long 
astigmatic launcher shown at the left of Fig. 3. The amplitude and 
phase measurements obtained with the long launcher are shown in 
Figs. 6 and 7 by the solid curves. Again, the dashed curves are the 
theoretical calculations. 

The data shown in Figs. 6a and b were obtained with the electric 
field parallel to the plates. As shown by the inserts on these figures, 
the launcher was rotated 90 degrees to obtain the second set of data. 
Here again, the measurements agree well with the calculated values. 
The phase deviations across the aperture are very small. 
Fig. 6—(a) Measurements for an electric field parallel to the plates of the long
launcher, and the theoretical amplitude distribution. The phase center lies in front of
the aperture. (Cont.)


The data for the long launcher were completed with the measure- 
ments shown in Figs. 7a and b. For these data the electric field is 
orthogonal to the plates. Again, the agreement between amplitude 
measurements and theoretical calculations is very good and the phase 
variations across the aperture are small. 

A comparison of the data presented in Figs. 4 through 7 for both
the short and long launchers confirms the fact that the phase center
separation for this arrangement of launcher can indeed be varied and
still maintain constant beamwidth in the two principal planes. The
frequency independence of these launchers was checked over a
20-percent band with no discernible change in beamwidth.

Fig. 6—(b) The electric field remains parallel to the plates, but the launcher is rotated
90 degrees. The phase center lies behind the aperture.

Fig. 7—(a) Measurements for an electric field orthogonal to the plates of the long
launcher, and the theoretical amplitude distribution. The phase center lies behind the
aperture. (Cont.)

Using the methods described in Ref. 10, the measured phase center 
separation of the short launcher would correct the astigmatism asso- 
ciated with a feed displaced about 18 degrees from the focus, whereas 
the long launcher would correct the astigmatism associated with a 
feed displacement of about 29 degrees. 
Fig. 7—(b) The electric field remains orthogonal to the plates, but the launcher is
rotated 90 degrees. The phase center lies in front of the aperture. 


IV. CONCLUSION 


Based upon the data presented here, a multiple-beam earth station 
antenna equipped with launchers of the type described here (where 
only the separation of the phase centers needs to vary) can indeed 
communicate with all satellites along the geosynchronous arc. 
On the Use of Vector Quantization for Connected-Digit Recognition


By S. C. GLINSKI* 
(Manuscript received June 6, 1984) 


Recent work at AT&T Bell Laboratories has demonstrated the efficacy of 
vector quantization in greatly reducing both the computational and memory 
requirements of isolated-word recognition systems. This efficiency is obtained 
at the expense of a marginal decrease in performance, and thus is an attractive 
approach. The purpose of this paper is to report on the results of a series of 
experiments in the application of vector-quantization strategies to a small- 
vocabulary, connected-word recognition task. Several strategies are investi- 
gated, including the use of speaker-trained code books versus universal code 
books, the use of binary and higher-order tree searches versus full searches of 
these code books, and the quantization of both test and reference frames 
versus reference frames only. For various strategies, the effect on error rate of 
varying the code-book size is also reported. Results indicate that the vector 
quantization approach is attractive for linear predictive coding-based con- 
nected-digit recognition. 


I. INTRODUCTION 


In the area of speech coding, the technique of Vector Quantization 
(VQ) has recently been successfully applied.’ In the standard ap- 
proach, a speech signal is framed and a feature vector is extracted 
from each frame. Each element of the feature vector is then separately 
quantized. In other words, the value of each element is replaced by its 
closest match from a set of discrete values. The set of values is chosen 
to minimize some error criterion while reducing the number of bits 
required to identify the element value. This feature vector is typically 
a set of Linear Predictive Coding (LPC) coefficients and perhaps an 
energy term. In the VQ approach, a feature vector is extracted as 
before. The entire vector of features is then quantized by replacing 
the vector with its closest match from a set or “code book” of feature 
vectors. Similarly, the entries in the code book are chosen to minimize 
some distortion measure while reducing the number of bits required 
to identify each frame of speech. In addition to reducing the number 
of bits required to represent each feature vector, by using a suitably 
compact code book in place of a large set of reference vectors, it is 
possible to greatly reduce the number of comparisons made between 
the unknown (test) feature vector and the stored-feature vectors. Since 
this comparison (or distortion measure) is the current bottleneck in 
many speech recognition systems, the VQ approach is quite effectively 
used in them. In fact, VQ has been used heavily in several different 
approaches to speech recognition, including Hidden Markov Modeling 
(HMM)?* and Dynamic Time Warping (DTW),° among others.*> 

The purpose of this paper is to report on the results of a series of 
experiments in the application of VQ strategies to a small-vocabulary, 
connected-word recognition task.’ These strategies include the use 
of Speaker-Dependent (SD) versus Speaker-Independent (SI) code 
books; the use of binary and higher-order tree searches versus full 
searches of these code books, as suggested in Ref. 2; and the quanti- 
zation of both test and reference frames versus reference frames only.®” 
The effect on error rate of varying the code-book size is also reported. 
In all experiments, the reference and test data sets are disjoint, and 
code books are trained with reference data. That is, speaker-dependent 
reference templates are quantized by vocabulary-dependent code books 
that are either speaker dependent or speaker independent. All test 
strings consist of deliberately spoken, connected words. 

In Section II some useful terminology is presented. In Section III 
theory is developed. In Section IV experimental results are presented. 


II. TERMINOLOGY 
This terminology is based, in part, on Refs. 7 and 10. 


$p$             LPC order 
$M^*$           maximum code-book size in words 
$M$             code-book size in words, $2 \le M \le M^*$ 
$k$             $\log_2 M$, code-book rate 
$i$             code-word (reference) index, $0 \le i \le M - 1$ 
$I(k)$          best code-word match at code book $k$ 
$j$             training-frame index, $1 \le j \le J$ 
$J$             number of training frames 
$B$             branching factor 
$\alpha$        $\log_2 B$ 
$a_i^k(l)$      LPC reference vector, $0 \le l \le p$ 
$r_i^k(l)$      LPC reflection coefficient vector, $0 \le l \le p$ 
$V_j$           autocorrelation matrix of training frame 
$E_j$           LPC residual energy of training frame 
$R_j(l)$        autocorrelation vector of training frame 
$d(a_i, V_j)$   distortion between reference and training frames 
$p_i(l)$        perturbation vector 
$D_J$           mean distortion over training frames 
$\epsilon$      distortion change threshold (0.01) 
$\delta$        perturbation factor (0.01) 
$\{T_M(i)\}$    set of training vectors whose best match is code word $i$ 
$C_M(i)$        number of training vectors in $\{T_M(i)\}$. 


Ill. VECTOR QUANTIZATION 


The VQ procedure consists of two main parts: code-book generation 
and the classification of test frames. In both cases, a code-book search 
procedure must be employed to classify the training or test (unknown) 
frames, respectively. The code-book generation process will be pre- 
sented first and the search strategy second. 


3.1 Code-book generation 


The basic generation procedure discussed in Refs. 1, 2, and 10 is 
employed and is illustrated in Fig. 1. The flowchart is from Ref. 10, 
but has been generalized to allow B-way splitting of centroids versus 
the original two-way splitting (B is a power of 2 in this work). The 
overall goal is to find a set of code words {a}, such that the mean 
distortion D,, produced by replacing each of the J training frames by 
its closest match from the code book {a}, is minimized. Succinctly 
stated, 


$$D_J(M^*) = \min_{\{a\}} \left[ \frac{1}{J} \sum_{j=1}^{J} \min_{1 \le i \le M^*} d(a_i, V_j) \right]. \qquad (1)$$

The distortion d can be calculated using the likelihood-ratio-dis- 
tance metric as follows: 

$$d(a_i, V_j) = \frac{a_i^T V_j a_i}{E_j} - 1, \qquad (2)$$

where row $m$, column $n$ of matrix $V_j$ contains $(1/N) R_j(|m - n|)$. $N$ 
is the window length and $R_j$ is the autocorrelation vector. Only the 
spectral-shape information is used in the quantizer. 
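
As an illustration of eq. (2), the following is a minimal NumPy sketch that builds the Toeplitz matrix from an autocorrelation vector and evaluates the likelihood-ratio distortion. The function names are hypothetical, and two conventions are assumptions rather than statements from the text: the LPC vector is taken as an inverse filter with leading coefficient 1, and the residual energy is computed from the same $(1/N)$-scaled matrix.

```python
import numpy as np

def toeplitz_autocorr(R, N):
    """Build V with V[m, n] = (1/N) * R[|m - n|] from an autocorrelation vector R."""
    R = np.asarray(R, dtype=float)
    lags = np.abs(np.subtract.outer(np.arange(len(R)), np.arange(len(R))))
    return R[lags] / N

def lpc_inverse_filter(V):
    """Coefficients a = (1, a_1, ..., a_p) minimizing a^T V a (assumed convention)."""
    tail = np.linalg.solve(V[1:, 1:], -V[1:, 0])
    return np.concatenate(([1.0], tail))

def likelihood_ratio(a_ref, V, E):
    """Eq. (2): d = (a_ref^T V a_ref) / E - 1, with E the frame's residual energy."""
    a_ref = np.asarray(a_ref, dtype=float)
    return float(a_ref @ V @ a_ref) / E - 1.0

# Toy frame: autocorrelation of a first-order process, window length N = 1.
V = toeplitz_autocorr([1.0, 0.5, 0.25], N=1)
a_own = lpc_inverse_filter(V)
E_own = float(a_own @ V @ a_own)   # residual energy of the frame's own model
```

Under these assumptions the distortion is zero when a frame is compared with its own model and positive for any other reference vector with leading coefficient 1.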

In practice, $\{a\}$ is found by first initializing with a small code book 
of size $M = B$ (where $B$ is typically 2), and then successively splitting 
its code words $B$ ways to obtain larger and larger code books until a 
code book of the desired size is obtained (see outer loop of Fig. 1). 
Each successive code book is corrected by iteratively classifying the 
training set, and adjusting each code word to be the centroid of the 
subset of training frames that best match that code word (inner 
loop). The final iteration is the one for which the mean 
distortion changes by less than a threshold $\epsilon$ relative to the previous 
mean distortion. Typically, $\epsilon = 0.01$. 

Fig. 1—Code-book generator flowchart. 

Initialization of centroids is as follows: 

$$r_i^k(l) = \begin{cases} 0.5\,(-1)^{\lfloor i/2^l \rfloor}, & 0 \le l \le \log_2 B - 1 \\ 0.0, & \log_2 B \le l \le p, \end{cases} \qquad (3)$$

where $k = \log_2 B$ and $0 \le i \le B - 1$. This splits the LPC reflection 
coefficient space on the first $\log_2 B$ coordinate axes. For instance, for 
$B = 4$, 

$$r_0^2 = (0.5, 0.5, 0, \cdots, 0)$$
$$r_1^2 = (-0.5, 0.5, 0, \cdots, 0)$$
$$r_2^2 = (0.5, -0.5, 0, \cdots, 0)$$
$$r_3^2 = (-0.5, -0.5, 0, \cdots, 0).$$

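Centroid splitting is done similarly as follows: 

$$p_i(l) = \begin{cases} \delta\,(-1)^{\lfloor i/2^l \rfloor}, & 0 \le l \le \log_2 B - 1 \\ 0.0, & \log_2 B \le l \le p, \end{cases} \qquad 0 \le i \le B - 1. \qquad (4)$$

Then 

$$r_{Bn+i}^{k+\alpha} = r_n^k\,(1 + p_i), \qquad 0 \le i \le B - 1, \quad 0 \le n \le 2^k - 1 \qquad (5)$$

accomplishes a $B$-way split of each reflection coefficient vector in code 
book $k$. The factor $\delta$ is typically set to 0.01. LPC model stability is 
ensured by requiring that $-1 < r < 1$. 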
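
The initialization and splitting rules can be sketched as follows. This is a hedged illustration, not the original program: the sign pattern is read off the $B = 4$ example (bit $l$ of the index $i$ selects the sign on axis $l$), the child ordering $Bn + i$ is assumed from the tree-search indexing, the perturbation is applied multiplicatively per coordinate, and the clipping value 0.999 used to keep $|r| < 1$ is a placeholder. The function names and the `dim` argument are hypothetical.

```python
import numpy as np

def init_centroids(B, dim):
    """Eq. (3): B initial reflection-coefficient vectors of length dim."""
    alpha = B.bit_length() - 1            # log2(B) for B a power of 2
    r = np.zeros((B, dim))
    for i in range(B):
        for l in range(alpha):
            r[i, l] = 0.5 * (-1.0) ** (i >> l)
    return r

def split_centroids(r, B, delta=0.01):
    """Eqs. (4)-(5): child B*n + i equals r[n] * (1 + p_i), kept inside (-1, 1)."""
    M, dim = r.shape
    alpha = B.bit_length() - 1
    pert = np.zeros((B, dim))
    for i in range(B):
        for l in range(alpha):
            pert[i, l] = delta * (-1.0) ** (i >> l)
    children = (r[:, None, :] * (1.0 + pert[None, :, :])).reshape(M * B, dim)
    return np.clip(children, -0.999, 0.999)   # assumed way to enforce stability
```

For example, `init_centroids(4, 8)` reproduces the four vectors listed for $B = 4$, and a subsequent `split_centroids(c, 2)` doubles the code-book size.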

The new centroid (code-word) computation depicted in Fig. 1 is 
done by averaging the normalized autocorrelations of those training 
frames that best match a given code word: 


$$R_i(l) = [C_M(i)]^{-1} \sum_j E_j^{-1} R_j(l), \qquad j \in \{T_M(i)\}. \qquad (6)$$


This computes the centroid as a spectral average of training frames.” 
The total-average-distortion calculation is 


$$D_J(M) = M^{-1} \sum_{i=1}^{M} [C_M(i)]^{-1} \sum_j d(a_i, V_j), \qquad j \in \{T_M(i)\}, \qquad (7)$$


where the LPC coefficients $a_i$ of eq. (7) are derived from the $R_i$ of eq. 
(6). 

The only program step in Fig. 1 not yet discussed is the classification 
of training vectors. Since this is common to both the code-book 
training and ultimate quantization of test vectors, it will be discussed 
in the next section. 


3.2 Classification of frames 


In the original development of VQ, a full search of the code book 
was used to classify training (or test) frames. The best matching code 
word in a k-bit code book is 




$$I(k) = \arg\min_i\, [d(a_i^k, V)], \qquad 0 \le i \le M^* - 1. \qquad (8)$$


For classification of the test, $k = \log_2 M^*$. During the code-book 
generation, $k$ varies from $\alpha = \log_2 B$ to $\log_2 M^*$, depending on 
how many times the code words have been split. 

However, Fig. 2a shows that a binary tree search is also possible. In 
any given code book (level), only two code words are searched: those 
that were split from the best matching code word in the previous code 
book (level). The search is initialized by setting $I(0) = 0$, and then 
executing 


$$I(k) = \arg\min_i\, [d(a_i^k, V)], \qquad 1 \le k \le \log_2 M^*,$$
$$2\,[I(k-1)] \le i \le 2\,[I(k-1)] + 1. \qquad (9)$$


The search stops at code book k = Log2M* for test classification, or 
at an intermediate k during code-book training. 

The tree search may be carried out using a different branching 
factor, $B$ (which in this work is specified to be an integer power of 2). 
Initialize $I(0) = 0$, and then 


$$I(k) = \arg\min_i\, [d(a_i^k, V)], \qquad k = \alpha, 2\alpha, \cdots, \log_2 M^*,$$
$$B\,[I(k-\alpha)] \le i \le B\,[I(k-\alpha) + 1] - 1. \qquad (10)$$


The stopping criteria are the same as for eq. (9). If $\log_2 M^* \ne \alpha\beta$ for 
any positive integer $\beta$, then the branching factor to the last code 
book (level) is not $B = 2^\alpha$, but $B = 2^\gamma$, $\gamma < \alpha$, where 

$$\gamma = \log_2 M^* - \alpha \left\lfloor \frac{\log_2 M^*}{\alpha} \right\rfloor. \qquad (11)$$

Fig. 2—Code-book tree search: (a) binary search; (b) quaternary search. 


Figure 2b shows an example of the tree-search classification proce- 
dure for a branching factor of 4 (B = 4). 
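
The difference between the two search strategies can be sketched in a few lines. The code below is an illustrative sketch only: it assumes the intermediate code books are kept as a list `levels`, that the children of code word $n$ occupy indices $Bn$ through $Bn + B - 1$ at the next level, and it substitutes a squared Euclidean distance for the LPC distortion so that the toy data stay small. All names are hypothetical.

```python
import numpy as np

def sq_euclid(c, x):
    """Stand-in distortion for the sketch (not the likelihood ratio of eq. (2))."""
    return float(np.sum((np.asarray(c) - np.asarray(x)) ** 2))

def full_search(codebook, x, dist):
    """Return (best index, number of distortion computations)."""
    d = [dist(c, x) for c in codebook]
    return int(np.argmin(d)), len(d)

def tree_search(levels, x, dist, B):
    """levels[t] is the code book after t + 1 B-way splits (B**(t + 1) words)."""
    best, n_dist = 0, 0
    for book in levels:
        lo = B * best                       # first child of the previous winner
        d = [dist(book[i], x) for i in range(lo, lo + B)]
        n_dist += B
        best = lo + int(np.argmin(d))
    return best, n_dist

# Toy one-dimensional binary tree with 2, 4, and 8 code words.
levels = [
    [[-1.0], [1.0]],
    [[-1.5], [-0.5], [0.5], [1.5]],
    [[-1.75], [-1.25], [-0.75], [-0.25], [0.25], [0.75], [1.25], [1.75]],
]
x = [0.8]
```

On this toy tree both searches select the same code word for `x`, with 8 distortion computations for the full search of the 3-bit code book and 6 for the binary tree search.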

Note that a full search of a $k$-bit code book requires $M = 2^k$ distortion 
computations, while a tree search of the same code book requires only 
$B \log_B M$ distortion computations with a branching factor of $B$. Thus, a 
great computational savings is achieved by using the tree-search strategy. 
The two search strategies suggest two approaches to connected-word 
recognition via DTW, as pictured in Fig. 3. In the following discussion 
an LPC distortion will be referred to as a distance, which is more 
common in the jargon of speech recognition. In Fig. 3a (full-search 
strategy) a vector of distances representing the distances between the 
current test frame and the entire code book is passed to the DTW 
procedure.® The vector is sufficient to represent the distance between 
any test and reference template frames. Since a full search was 
employed, all these distances are available along with $I(k)$, the 
best matching code word. In the tree-search strategy (see Fig. 3b) this 
is not the case. Only the index $I(k)$ and corresponding distance 
$d(a_{I(k)}, V_j)$ are produced. Thus, it is necessary to precompute and store 
a cross table of distances between all pairs of code words. This implies 
that both reference and test frames are quantized to the code-book 
entries, unlike in Fig. 3a, where solely the reference frames are 
quantized. Thus, the tree-search strategy will require much less computing 
power but will probably perform worse than the full-search strategy, 
because of both the implicit quantization of the test signal and errors 
introduced by the tree search. 

Fig. 3—Quantizer strategies: (a) full search; (b) tree search. 

In the HMM approach, however, for the procedure analogous to 
time warping (Viterbi scoring, for example), the only information 
necessary to pass from the quantizer is the index $I(k)$ (no distance 
vector). Thus, there is no need for the precomputed distance matrix 
in the tree-search strategy, and the only error introduced results from 
the tree search. 


3.3 Empty cluster problem 


In practice, during the classification of training frames, it may 
happen that certain code words may not match any training frame. In 
other words, the set $\{T_M(i)\}$ may be empty for one or more $i$, and the 
count $C_M(i) = 0$ for the same $i$. This is especially a problem as the 
code-book size approaches the number of training frames. In this 
work, the problem is addressed by simply ignoring the empty clusters 
(which reduces the effective code-book size). During a full-search 
strategy, the empty clusters are simply skipped. In a tree-search 
strategy, however, it may happen that as the tree is traversed, a “dead 
end” is reached before the last level is reached. That is, all branches 
from a given node lead to code words corresponding to empty clusters. 
In this case, the test frame is classified as the code word corresponding 
to the last nonempty cluster encountered as the tree was traversed. 
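
The dead-end rule can be sketched as a small variant of the tree search. This is a hedged sketch under assumed data structures: `counts[t][i]` holds the number of training frames assigned to code word `i` at level `t`, children of node `n` are `B*n` to `B*n + B - 1`, and the squared-difference distance is only a stand-in. The function returns the level and index of the last nonempty code word reached.

```python
def sq_euclid(c, x):
    """Stand-in distortion for the sketch."""
    return sum((ci - xi) ** 2 for ci, xi in zip(c, x))

def tree_search_skip_empty(levels, counts, x, dist, B):
    """Descend the tree, ignoring empty clusters; stop at a dead end."""
    last = (None, 0)                  # (level, index) of last nonempty code word
    node = 0
    for t, book in enumerate(levels):
        lo = B * node
        live = [i for i in range(lo, lo + B) if counts[t][i] > 0]
        if not live:                  # every child is an empty cluster
            break
        node = min(live, key=lambda i: dist(book[i], x))
        last = (t, node)
    return last
```

When every child of the current node is empty, the frame keeps the classification obtained one level higher; otherwise empty siblings are simply skipped.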

The advantage of this approach is that experiments may be run 
where the number of training frames is fewer than the specified 
number of code words. This is especially true for the speaker-depend- 
ent code-book experiments in this work. 


IV. EXPERIMENTAL RESULTS 


To evaluate the various VQ strategies, the connected-word recog- 
nizer of Ref. 7 was employed. It can be shown that this recognition 
algorithm performs identically to the level-building algorithm of Myers 
and Rabiner! in the case where no constraint is imposed on the 
number of words in the test. When this constraint is removed, the 
level-building algorithm performance is marginally superior.’? The 
distance measure used in the recognizer was the likelihood ratio of eq. 
(2), clipped at 2.0. The test data consist of 40 random strings, spoken 
by each of 18 different speakers, 9 male and 9 female. There are equal 
numbers of strings of length 2, 3, 4, and 5 digits. The strings used in 
this work are all deliberately spoken (about two words per second). 

The reference data consist of 33 reference templates (about 960 
reference frames) for each speaker. Three groups of 11 templates each 
contain the digits 0 through 9, and one repetition of the digit with an 
unreleased final consonant “t.” The first group includes standard 
isolated-word reference templates; while the second contains em- 
bedded, noncoarticulated, deliberately spoken references; and the third 
contains embedded, coarticulated, normally spoken references. Em- 
bedded templates are those extracted from between two other tem- 
plates in running speech. The three groups are used to compensate for 
differences in speaking rates and coarticulation effects. 

Both the reference and test data were passed through an endpoint 
detector. All the speech data are identical to those used in previous 
work.!? Note that all experiments entail speaker-dependent, con- 
nected-word recognition, and code books are trained with the reference 
template data. Reference templates and test data sets are disjoint. 

For each of the 18 speakers, a speaker-independent code book was 
generated from the reference templates of the other 17 speakers. These 
code books are “vocabulary dependent,” since the same vocabulary 
was included in the code-book training data as in reference and test 
data. 

For speaker-dependent code books, the reference database for an 
individual speaker was used to train the code book for that speaker. 
This corresponds to an implementation in which reference templates 
for a given speaker are created first, these templates are then used to 
train a code book, and then the reference templates are quantized with 
the same code book. Thus, in this case the code books are both speaker 
dependent and vocabulary dependent. 

Parameters of the code-book generator were chosen as follows: 


$\epsilon = 0.01$ (distortion change threshold) 
$\delta = 0.01$ (perturbation factor). 

The baseline string error rate for the recognizer is 5.4 percent. This 
baseline error rate is for the case where no vector quantization is used. 


V. SPEAKER-INDEPENDENT CODE BOOKS 


In the first experiment, speaker-independent code books were used 
to quantize reference frames only (one-sided quantization’). Code 
books of rate 4, 6, 8, and 10 bits were investigated for both binary and 
full-search classification during code-book generation and reference 
quantization. The mean distortion of the binary and full-search code 
books is shown in Fig. 4. This is the mean distance $D_J$ between the 
training vectors and their best matching code words. The recognizer 
performance, as a function of code-book rate, is shown in Fig. 4 and 
listed in Table I. Results show that: (1) Only the use of a 10-bit, full- 
search code book results in performance approaching the baseline 
performance. Other rates result in approximately a doubling (or worse) 
in error rate. (2) About a 1-bit savings in code-book rate is obtained 
by using a full-search strategy versus a tree-search strategy, while 
recognizer performance is held constant. 

Fig. 4—Recognizer performance and code-book distortion for one-sided VQ and SI 
code books. (String errors in percent and mean distortion versus rate in bits 
per frame, for full search and binary search.) 


VI. SPEAKER-DEPENDENT CODE BOOKS 


The second experiment was identical to the first except that speaker- 
dependent code books were used in place of speaker-independent code 
books. The mean distortion of the binary- and full-search code books 
is shown in Fig. 5. The recognizer performance as a function of code- 
book rate is shown in Fig. 5, and listed in Table I. Results indicate the 
following: (1) The recognizer shows little loss in performance at rates 
of 8 bits and higher. This represents a compression on the reference 
database of about 4:1 for each speaker. (2) The binary tree-search 
strategy rivals the full-search strategy with regard to recognizer per- 
formance at rates of 6 bits and up. (3) For a given performance, the 
SD code book achieved a bit-rate savings from 1 bit (at a low bit rate) 
to 3 bits (at a high bit rate) over the SI code book. 

Table I—Recognizer performance (percent string errors) 

                                Rate (Bits/Frame) 
Code Book   VQ       Search      4      6      8     10 
SI          1 Side   Full       28.3   15.1   11.0    8.1 
SI          1 Side   Binary     34.9   18.2   12.1   10.1 
SD          1 Side   Full       15.3    9.3    6.0    5.7 
SD          1 Side   Binary     22.1    9.6    6.4    5.6 
SD          2 Side   Full       21.0   10.8    7.5    5.8 
SD          2 Side   Binary     31.2   14.7   10.7    9.3 

Fig. 5—Recognizer performance and code-book distortion for one-sided VQ and SD 
code books. 

Fig. 6—Recognizer performance for two-sided VQ and SD code books. 


VII. REFERENCE AND TEST QUANTIZATION 


The third experiment was identical to the second, except that both 
reference and test frames were quantized (two-sided quantization’). 
Recognizer performance is plotted in Fig. 6 and listed in Table I. The 
results show the following: (1) Two-sided quantization in the full- 
search case rivals one-sided quantization at rates of 6 bits and up. 
(2) The binary-search case does not compete as favorably with the 
full-search approach as in experiment 2. 

Table II—Recognizer performance using tree-search strategies 

                          Branching Factor 
                  Full search    2      4      8     16 
% String errors       6.0       6.4    6.5    6.9    7.4 


VIII. TREE-SEARCH EXPERIMENT 


In experiment 4, experiment 2 was repeated for the 8-bit code-book 
rate, while the branching factor B in the tree-search approach was 
varied between 2 and 16. Only reference frames were quantized (SD 
code book). Performance data listed in Table II and plotted in Fig. 7 
indicate that performance drops off only slightly with an increasing 
branching factor. 


IX. CONCLUSIONS 


Four experiments were run to determine the efficacy of VQ as 
applied to a small-vocabulary, connected-word recognition task. Re- 
sults show that both full- and binary-search, one-sided VQ, at code- 
book rates of 6 bits and higher, offer an attractive cost-performance 
trade-off in an LPC-based connected-digit recognizer. Full-search two- 
sided VQ is also a competitive approach. For tree-search strategies, 
recognizer error rate is relatively insensitive to branching factor. 

Fig. 7—Recognizer performance for tree-search strategies using one-sided VQ and 
SD code books at 8 bits per frame. (String errors in percent versus branching 
factor: 2, 4, 8, 16.) 

Incorporation of Temporal Structure Into a 
Vector-Quantization-Based Preprocessor for 
Speaker-Independent, Isolated-Word 
Recognition 

Recently a new structure for isolated-word recognition was proposed in 
which a separate Vector Quantization (VQ) code book was designed for each 
word in the vocabulary. The word-based VQs were used as a front-end 
preprocessor to eliminate word candidates whose distortion scores were large; 
a dynamic time-warping processor then resolved the choice among the re- 
maining word candidates. The above scheme worked very well for small 
vocabularies; however, the major flaw was the lack of temporal information in 
the word-based VQ processor. As such, as the vocabulary grew in size and 
complexity, the ability of the VQ processor to resolve among similar sounding 
words decreased dramatically, and the effectiveness of the proposed recogni- 
tion structure similarly decreased. To alleviate this difficulty a technique for 
incorporating temporal structure into the preprocessor is proposed. In partic- 
ular, the probability density function of the time of occurrence for each vector 
in the code book is estimated from a training sequence. In the recognizer, the 
spectral distance score of the VQ is combined with a temporal distance score, 
for each frame in the word. An evaluation of the modified recognizer showed 
slightly improved performance on the digits vocabulary and greatly improved 
performance on a vocabulary of 129 airlines terms. 


I. INTRODUCTION 


There has been a great deal of interest recently in isolated-word 
recognition techniques that maintain high performance, but do so at 
low computational cost.’° The reason for this renewed interest in 
“low-cost” recognizers is the desire to implement such systems on 
conventional microprocessors, where the computational power is no- 
where near as great as needed for the “higher-cost” recognition sys- 
tems. 

* Authors are employees of AT&T Bell Laboratories. 

One of the most promising of the low-cost recognizers is the Vector- 
Quantization (VQ)-based recognizer, originally proposed by Shore and 
Burton,” and modified by Burton et al.* and Pan et al.° The basic idea 
in this recognition system is to design a separate VQ code book for 
each word in the vocabulary, based on a training sequence of several 
tokens of each word by one or more talkers. In the original Shore and 
Burton implementation,” the recognizer chose the word in the vocab- 
ulary whose average quantization distortion (according to its particular 
code book) was minimum. In the implementation of Pan et al.,° the 
word-based VQs were used as a front-end preprocessor to eliminate 
word candidates whose distortion scores were large; a Dynamic Time 
Warping (DTW) processor then resolved the choice among the re- 
maining word candidates. 

Both of the above implementations of the word-based VQ recognizer 
worked very well for small vocabularies; however, as the vocabulary 
size and/or complexity grew, the ability of the VQ processor to resolve 
among similar sounding words decreased dramatically, and the effec- 
tiveness of the recognizer similarly decreased. 

The major problem with the word-based VQ processor, for large 
vocabularies, was its inability to use temporal information, i.e., to 
integrate information about the times of occurrence of the speech 
sounds with the fact that the sounds occurred within the word. One 
simple method for incorporating this type of temporal information 
was proposed by Buzo et al.,° and developed by Burton et al.* In this 
approach, gross temporal information was incorporated into the re- 
cognizer by subdividing each input word into R nonoverlapping re- 
gions, using a separate code book for each region. In this manner each 
word was characterized by R code books, obtained from a training 
procedure in which a similar subdivision of each training word was 
made. Burton et al. reported good success with this method.* 

An alternative procedure for incorporating temporal information 
into the VQ-based preprocessor is proposed in this paper. In particular, 
for each vector in each word-based code book, the probability density 
function of the time of occurrence (on a normalized time scale) is 
estimated from the same set of training sequences used to derive the 
code-book vectors. In the recognizer, the spectral distance score of the 
VQ preprocessor is combined with a (scaled) temporal distance score, 
for each frame in the word. We use the structure of a preprocessor to 
screen out unlikely word candidates (based on the combined spectral 
and temporal distance), and resolve the fine word distinctions with a 
DTW processor. 

An evaluation of the modified recognizer structure, described above, 
was performed using both a small vocabulary (the 10 digits), and a 
moderate-size vocabulary (129 airline terms). Both vocabularies were 
tested in a speaker-independent mode, i.e., code books and probability 
histograms were generated from speaker-independent training sets. 
Results showed recognition error performance on both vocabularies 
was comparable to that of the best recognizers; however, computational 
costs were comparable to those of a “low-cost” recognizer. 

The organization of this paper is as follows. In Section II we discuss 
the proposed recognition algorithm, which combines temporal infor- 
mation along with the spectral information of the word-based VQ 
preprocessor. In Section III we describe an experimental evaluation of 
the new recognition structure. In Section IV we review the results of 
the evaluation, and discuss potential ways of lowering the cost of the 
recognizer even further. Finally, in Section V, we summarize our 
findings. 


II. THE PROPOSED RECOGNITION ALGORITHM 


A block diagram of the proposed recognizer is given in Fig. 1. The 
input speech signal is digitized at a 6.67-kHz rate, the word endpoints 
(beginning and ending frames) are detected, and a Linear Predictive 
Coding (LPC) analysis is performed on all frames within the word. 
The LPC analysis is an eighth-order analysis of 45-ms frames (300 
samples), spaced every 15 ms (100 samples) along the word. Each 
overlapping 45 ms section of speech is windowed using a Hamming 
window, and an eighth-order autocorrelation analysis is performed 
(giving nine autocorrelation values per frame). The results of the LPC 
analysis are the set of frame log energies (suitably normalized to the 
peak log energy of the word), $E_i$, $1 \le i \le I$, and the LPC vectors $a_i$, 
$1 \le i \le I$, where $I$ denotes the number of frames in the word. 

Fig. 1—Block diagram of isolated-word recognizer that incorporates a word-based 
vector quantization preprocessor and a dynamic time-warping processor. 

The word-based LPC preprocessor uses the analysis results (i.e., the 
frame log energies and the LPC vectors) to eliminate all unlikely 
candidates from further analysis. Thus the output of the preprocessor 
is a list of candidates for the unknown word. A DTW processor then 
decides among the words in the candidate list using a conventional 
dynamic time-warping alignment of the unknown test word against a 
set of stored word reference patterns. A K Nearest Neighbor (KNN) 
decision rule chooses the word whose average DTW distance of the K- 
best word patterns is smallest. In cases where the list of candidates 
from the preprocessor contains only a single choice, the DTW proc- 
essor is bypassed and a final decision is made by the preprocessor. 


2.1 The word-based VQ preprocessor 


A block diagram of the word-based VQ preprocessor is given in Fig. 
2. Each word in the vocabulary is characterized, in the preprocessor, 
by a code book, B, and by a temporal probability table, P. The code 
book consists of a set of LPC vectors (supplemented by a log energy 
scalar), $b_k$, $1 \le k \le L$, which characterize the LPC vectors of a training 
set of multiple occurrences of the word. The code-book vectors are 
chosen by a VQ design algorithm, which minimizes the average dis- 
tortion between the training vectors and the code-book vectors.”® 
Typically, for word recognition applications, values of L (the total 
number of vectors in each word code book) range from 4 to 32. 

Fig. 2—Block diagram of the word-based vector quantization preprocessor, which 
combines spectral and temporal distance scores. 

The temporal probability table, P, is derived from both the code 
book, B, and the word training data in the following way. The elements 
of P are the values $p_k(t)$, defined as: 


$p_k(t)$ = probability that the code-book vector, $k$, occurs at normal- 
ized time $t = i/I$ within the word. 


Thus the values $p_k(t)$ (where suitably quantized values of $t$ are used 
in practice) constitute a temporal probability table for the code-book 
vectors. The way in which values of $p_k(t)$ are obtained, from the 
training set, is as follows: 

1. Each training sequence is linearly warped to a fixed length, f = 
40 frames. (Thus values of $p_k(t)$ are obtained for $t$ = 1/40, 2/40, ..., 
40/40.) 

2. Each vector of each linearly warped training sequence is vector 
quantized, using code book B. 

3. At each time $t$, all code-book vectors whose spectral distortion 
distance score is within a fixed threshold, A, of the minimum distortion 
score for the frame are considered to have occurred. 

4. The value used for $p_k(t)$ is the ratio between the number of times 
code-book vector $k$ occurred at time $t$ (as defined in step 3 above), and 
the number of times any code-book vector occurred at time $t$, over the 
entire training set for the word. In this manner $\sum_{k=1}^{L} p_k(t) = 1$ for 
all $t$. 
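
The four steps above can be sketched as follows. This is a minimal sketch under stated assumptions: the linear warp is done by nearest-frame selection (the text does not say how the warp is implemented), "within a threshold of the minimum" is read as `d <= d_min + thresh`, and a squared Euclidean distance replaces the LPC distortion for the toy example. The names `temporal_table`, `sq_dist`, `n_bins`, and `thresh` are hypothetical.

```python
import numpy as np

def sq_dist(b, x):
    """Stand-in distortion for the sketch."""
    return float(np.sum((b - x) ** 2))

def temporal_table(tokens, codebook, dist, n_bins=40, thresh=0.25):
    """Estimate p[k, t]; tokens is a list of (frames x dim) arrays for one word."""
    hits = np.zeros((len(codebook), n_bins))
    for frames in tokens:
        frames = np.asarray(frames)
        n_frames = len(frames)
        # Step 1: linear warp to n_bins frames (nearest-frame selection, assumed).
        idx = (np.arange(n_bins) * n_frames) // n_bins
        for t, frame in enumerate(frames[idx]):
            # Step 2: distortion to every code-book vector.
            d = np.array([dist(b, frame) for b in codebook])
            # Step 3: every vector close enough to the best one "occurred".
            hits[:, t] += d <= d.min() + thresh
    # Step 4: normalize so that each time slot sums to 1 over the code book.
    return hits / hits.sum(axis=0, keepdims=True)
```

Because the best-matching vector always counts as having occurred, every column has a nonzero total as long as at least one training token is supplied.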
To illustrate the results of the above procedure, Fig. 3 shows the 
resulting $p_k(t)$ temporal probability tables for an L = 8 vector code 
book for the word six with a training set of 150 tokens of the word 
derived from 150 different talkers (75 male, 75 female). A value of 
A = 0.25 was used in computing $p_k(t)$. Experimentation with A showed 
the resulting temporal probability tables were insensitive to A over a 
broad range; this was because with L = 8 (or 16) vectors, generally 
there was only a small fraction of the code-book vectors whose distor- 
tion scores were low. Given a large enough training set, the exact value 
of A (as long as it was relatively small) is almost irrelevant. 

The temporal probability tables of Fig. 3, for the word six, show 
that a smooth probability density was obtained for all vectors. Further, 
we see that for some vectors a unimodal distribution resulted; for 
other vectors distinct multimodal distributions are found. In this 
example, the code-book vectors whose sounds represent the vowel /I/ 
have a unimodal distribution, since this sound occurs only at a single 
place in the word six. Code-book vectors whose sounds represent the 
fricative /s/ have a distinct two-mode distribution, since /s/ occurs at 
both the beginning and end of the word six. Finally, code-book vectors 
whose sounds represent silence have three modes, since silence can be 
found at the beginning, end, and in the stop gap of the word six. All 
three types of distributions are clearly seen in the data of Fig. 3. 

Fig. 3—Estimates of temporal probability density functions for the eight code-book 
vectors of the digit six. (Forty-frame normalization of the word duration is assumed.) 

For convenience, and to reduce computation, the temporal proba- 
bility tables were stored as 


$$\hat{p}_k(t) = -\gamma \log[p_k(t)], \qquad (1)$$


i.e., as negative log probabilities, so they could be combined readily 
with the LPC distances. The multiplier, $\gamma$, was chosen so that, aver- 
aged over the entire training set, the average value of $\hat{p}_k(t)$ was the 
same as the average LPC distance. Typically, the value of $\gamma$ was about 
0.45 for L = 8 vector code books, and about 0.22 for L = 16 vector 
code books. Also, values of $p_k(t)$ were clipped at a level of 10~*; hence 
no temporal probability score was 0. 
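
A short sketch of the storage rule and of one way to pick the multiplier is given below. It is an illustration under assumptions: the clipping level is left as a parameter (`floor`), with any numeric value supplied by the caller being a placeholder, and the multiplier is chosen as the ratio of the mean LPC distance to the mean unscaled negative log probability over the (vector, time) pairs observed in training. The function names are hypothetical.

```python
import numpy as np

def scaled_neg_log(p, gamma, floor):
    """Stored table: -gamma * log(p), with p clipped from below at floor."""
    return -gamma * np.log(np.maximum(p, floor))

def choose_gamma(p, observed, mean_lpc_distance, floor):
    """Pick gamma so the mean stored score over observed (k, t) pairs
    equals the mean LPC distance."""
    neg_log = [-np.log(max(p[k, t], floor)) for k, t in observed]
    return mean_lpc_distance / np.mean(neg_log)
```

Clipping before the logarithm keeps every stored score finite, which is what allows the table to be added directly to the spectral distance.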


2.2 Combining LPC distance and temporal probability score 


After a great deal of investigation into ways of combining LPC 
distance and temporal probability scores, the resulting distance score 
that was used was 


$$d(a_i, E_i, B, P) = (1 - \alpha)\, d_{SP}(a_i, E_i, B) + \alpha\, d_{TP}(k_i, P), \qquad (2)$$

where $d_{SP}$ was the spectral (LPC combined with energy) distance and 
$d_{TP}$ was the temporal probability distance. The scaling value α was 
chosen by optimization and determined the mix of spectral and temporal 
“distances.” A value of α = 0 represents pure spectral distance; 
similarly, a value of α = 1.0 represents pure “temporal distance.” 

The spectral distance, which combined the LPC distance with the 
energy distance, had the form 


$$d_{SP}(a_i, E_i, B) = \min_{k} \left[ d_{LPC}(a_i, b_k) + c\, f\big(d_E(E_i, E_k)\big) \right], \qquad (3)$$
where 


dypc(ai, by) = (ioe 


with $V_i$ being the autocorrelation matrix of the input frame, $E_i$ being 
the normalized log energy of the input frame, and $E_k$ being the 
normalized log energy of the kth code-book vector. We then have 


$$d_E(E_i, E_k) = |E_k - E_i|, \qquad (5)$$

with 

$$f(E) = \begin{cases} 0, & 0 \le E \le E_{LO} \\ E - E_{LO}, & E_{LO} < E \le E_{HI} \\ E_{HI} - E_{LO}, & E_{HI} < E, \end{cases} \qquad (6)$$


where c, Exo, Ey, and Eor were suitably chosen constants. (We used 
c = 0.1, $E_{LO}$ = 6 dB, and $E_{HI}$ = 20 dB.) 
The temporal distance of eq. (2) was of the form 


$$d_{TP}(k_i, P) = \tilde{p}_{k_i}([i \mid I]), \qquad (7)$$


where $[i \mid I]$ is the value of i/I rounded to the nearest 1/40. 

The following sequence of steps was required to generate a combined 
distance score in the preprocessor: 

1. Vector quantize the input frame (by each word-based code book), 
at time t = i/I, consisting of LPC vector $a_i$ and normalized energy $E_i$, 
and determine the minimum spectral distance, $d_{SP}$, and the index of 
the best code-book vector, $k_i$. 

2. Access the temporal distance as $\tilde{p}_{k_i}(t)$, where t is quantized to the 
nearest 1/40 (since tables with 40 entries were used). 

3. Combine $d_{SP}$ and $d_{TP}$ according to eq. (2). 

The above procedure is performed at each frame for each word in the 
vocabulary, and the resulting distance scores are accumulated for each 
word, as shown in Fig. 2. 
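
The three steps above can be sketched for a single frame and a single word as follows. This is an illustrative sketch under stated assumptions: all function names are hypothetical, the LPC distances are taken as precomputed inputs, the constants default to the values quoted in the text (c = 0.1, 6 dB, 20 dB), and the mapping of the quantized time onto a zero-based table index is an assumption.

```python
def energy_penalty(e, e_lo=6.0, e_hi=20.0):
    """Piecewise-linear f(E) of eq. (6), with E in dB."""
    if e <= e_lo:
        return 0.0
    if e <= e_hi:
        return e - e_lo
    return e_hi - e_lo


def spectral_distance(lpc_dists, frame_energy, codebook_energies, c=0.1):
    """Eq. (3): best code-book vector by LPC distance plus energy term.

    lpc_dists[k] is the precomputed LPC distance to code-book vector k.
    Returns (d_sp, k_best).
    """
    scores = [
        d + c * energy_penalty(abs(e_k - frame_energy))
        for d, e_k in zip(lpc_dists, codebook_energies)
    ]
    k_best = min(range(len(scores)), key=scores.__getitem__)
    return scores[k_best], k_best


def time_bin(i, n_frames, bins=40):
    """Quantize t = i / n_frames (i = 1..n_frames) to the nearest 1/bins
    and return a zero-based table index (index convention assumed)."""
    b = int(bins * i / n_frames + 0.5)
    return min(max(b, 1), bins) - 1


def combined_distance(lpc_dists, frame_energy, codebook_energies,
                      temporal_tables, i, n_frames, alpha=0.5):
    """Eq. (2): mix spectral and temporal distances for one frame."""
    d_sp, k = spectral_distance(lpc_dists, frame_energy, codebook_energies)
    d_tp = temporal_tables[k][time_bin(i, n_frames)]
    return (1.0 - alpha) * d_sp + alpha * d_tp
```

Accumulating `combined_distance` over all frames of the test word, separately for each vocabulary word, gives the per-word scores used by the decision logic.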

The preprocessor decision logic is essentially the same as used by 
Pan et al.,° namely: 

1. Find all word candidates v, such that the average distortion, $D^v$, 

$$D^v = \frac{1}{I} \sum_{i=1}^{I} d(a_i, E_i, B^v, P^v) \qquad (8)$$

is within a fixed threshold, δ, of the minimum average distortion across 
all words. 

2. If only a single word candidate exists, then the recognition is 
over—i.e., no DTW processing is required. 

3. If more than one word candidate exists, then use the DTW 
processor to make the final recognition decision among the word 
candidates. 
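
The decision logic above can be sketched as follows. The function names and the `dtw_decide` callback are hypothetical stand-ins (the DTW processor itself is not modeled); only the thresholding rule is taken from the text.

```python
def select_candidates(avg_distortion, delta):
    """Return the words whose average distortion lies within `delta`
    of the minimum average distortion (step 1)."""
    d_min = min(avg_distortion.values())
    return [w for w, d in avg_distortion.items() if d - d_min <= delta]


def recognize(avg_distortion, delta, dtw_decide):
    """Steps 2 and 3: accept a unique candidate directly, otherwise
    hand the surviving candidates to the (hypothetical) DTW stage."""
    candidates = select_candidates(avg_distortion, delta)
    if len(candidates) == 1:
        return candidates[0]
    return dtw_decide(candidates)
```

Setting `delta` to 0 makes the preprocessor decide alone on the lowest score, which is the configuration used for the preliminary runs in Section 3.1.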

We now describe the results of a series of experiments designed to 
evaluate the performance of the overall recognizer of Figs. 1 and 2. 


III. EXPERIMENTAL EVALUATION 


Two databases were used to evaluate the performance of the recog- 
nizer. All recordings were made over a standard, local, dialed-up 
telephone line. The first database was a digits set consisting of four 
sets of 1000 digits each (100 talkers × 10 digits/talker). We call the 
digits sets DIG1, DIG2, DIG3, and DIG4. Their characteristics are as 
follows: 

DIG1—100 talkers (50 male, 50 female), 1 replication of each digit 
by each talker.’° These recordings have been used as a training set in 
a wide variety of evaluations of isolated-word recognizers. 

DIG2—Same 100 talkers and recording conditions as DIG1; recordings 
made several weeks later than those of DIG1. 

DIG3—100 new talkers (50 male, 50 female), 1 averaged occurrence 
of each digit by each talker obtained from averaging a pair of robust 
tokens of the digit.’’!” The transmission conditions (i.e., analog front 
end, filter cutoff frequencies, etc.) differed slightly from those used in 
recording the DIG1 and DIG2 databases. 

DIG4—A second group of 100 new talkers (50 male, 50 female), 20 
recordings of each digit by each talker.!? A random sampling of 1 of 
the recordings of each digit by each talker was used. The transmission 
conditions differed substantially from those used in recording the 
other databases. 

The templates (12 per word, speaker independent) for the DTW 
processing were created from the data of set DIG1. The training data 
for the word-based VQ preprocessor (to get the code books, $B^v$, and 
the temporal probability tables, $P^v$) were derived from a randomly 
chosen set of 150 tokens of each word from sets DIG1, DIG3, and 
DIG4. (Of course these same training data could have been used to 
create the speaker-independent reference templates for DTW processing; 
however, a conveniently available template set was used.) For 
testing the recognizer, all four digit sets were used. 

The second database was a vocabulary of 129 words used in an 
airlines information and reservation system.’ Two sets of data, called 
AIR1 and AIR2, were used. Their characteristics were: 

AIR1—100 talkers (50 male, 50 female), 1 averaged occurrence of 
each word by each talker obtained from averaging a pair of robust 
tokens of the word.” 

AIR2—20 new talkers (10 male, 10 female), 1 replication of each 
word by each talker. The data of set AIR1 were used to create both 
the word reference templates (speaker independent, 12 per word), and 
to give the word code books and word temporal probability tables. The 
data of set AIR2 were used to test the recognizer. 


3.1 Results on the digits vocabulary 


For each of the digit test sets, a preliminary test run was performed 
in which the preprocessor was used by itself to make the final recognition 
decision based on the word with the lowest combined spectral 
plus temporal distance score. (Equivalently, δ, in the decision logic, 
was set to 0.) The distance combining parameter, α, in eq. (2) was 
then varied from 0 to 1 (in steps of 0.1) and a curve of the preprocessor 
recognition accuracy versus α was computed. A typical such curve for 
the test set DIG1 is given in Fig. 4. The behavior of the recognition 
rate, shown in this figure, is typical for all the digit test sets. It can be 
seen that for α = 0 (only spectral distance) and for α = 1.0 (only 
temporal distance), the recognition rate of the preprocessor (91.4 
percent for α = 0, 91.2 percent for α = 1.0) is significantly lower than 
its value at the peak of the curve (97.5 percent for α = 0.7). This result 
strongly points out the value of combining spectral and temporal 
distances in the preprocessor. It also can be seen that in the vicinity 
of the peak (near α = 0.7), the recognition rate is fairly constant (its 
value at α = 0.5 is 97.1 percent); hence, a fairly broad region of choices 
for α is possible. Across the four digit test sets, the optimum value of 
α varied from 0.4 to 0.7. If we used the value α = 0.5 for all digit sets, 
the preprocessor recognition rate changed less than 0.2 percent, on 
average. 

Fig. 4—Curve of average digit recognition rate versus the combining multiplier, α, 
for the data of test set DIG1. (Note that α = 0 corresponds to pure spectral distance 
and α = 1.0 corresponds to pure temporal distance.) (Plot of probability correct, 
from 0.9 to 1.0, versus α, from 0 to 1.0.) 

Table I—Average error rates for digits vocabulary 

                      Average Digit Error Rate (%) 
Code-Book Size     DIG1    DIG2    DIG3    DIG4    Overall 
(a) Preprocessor Alone 
       8            2.5     3.1     3.1     4.3     3.3 
      16            2.5     2.6     2.5     3.7     2.8 
(b) Complete Recognizer 
       8            2.9     2.5     2.2     2.9     2.6 
      16            1.3     2.3     2.2     2.8     2.2 

A complete set of performance results on the digits test sets is given 
in Table I. Table Ia gives average digit error rates for the preprocessor 
working without the DTW processor, for the four test sets (and an 
overall average), for code books with 8 and 16 vectors per word. The 
average digit error rate is 3.3 percent for 8 vector code books, and 2.8 
percent for 16 vector code books. Table Ib gives average digit error 
rates for the complete recognizer, as a function of code-book size. The 
threshold, δ, in the preprocessor was set so that, on average, no DTW 
was required about 83 percent of the time (i.e., the preprocessor made 
the final decision); in the remaining 17 percent of trials, an average 
of 2.25 word candidates were passed on to the DTW processor. 
No quantization of the reference templates in the DTW processor was 
used; previous experience with this data set indicates that no degra- 
dation need occur if the reference template quantization is done 
correctly.° 

From Table Ib it can be seen that the entire recognizer achieved an 
average digit error rate of 2.6 percent for L = 8 vector code books, and 
2.2 percent for L = 16 vector code books. These results represent 
improvements of 0.6 to 0.7 percent in word accuracy; for 4000 test 
digits, such a result is statistically significant. 


3.2 Results on the airline vocabulary 


For the airline vocabulary, a curve of preprocessor average performance 
versus the combining multiplier α was again run, and the results 
are given in Fig. 5. Although the form of the curve is similar to that 
of the digits case (Fig. 4), performance improves significantly when 
using both spectral and temporal distance, as opposed to either spectral 
or temporal distance alone. We see from Fig. 5 that for α = 0 (spectral 
distance only), the preprocessor achieves a 65.4-percent accuracy; for 
α = 1.0 (temporal distance only), the accuracy is 73.2 percent (it is 
better than the result for α = 0). However, for α = 0.5, the combined 
distance yields a performance of 88.1-percent word accuracy, an improvement 
in accuracy of 15.5 to 22.7 percent over the individual 
distances. 

Fig. 5—Curve of average word recognition rate versus the combining multiplier, α, 
for the airline test data. (Plot of probability correct versus α.) 

Table II—Average word error rates for the airlines vocabulary 

                      Average Word Error (%) 
Code-Book Size     Preprocessor Alone     Total Recognizer 
       8                 14.8                  11.7 
      16                 11.9                   8.9 

The overall recognizer performance on the airline vocabulary is 
given in Table II. The 8-vector-per-word system has a preprocessor 
error rate of 14.8 percent, whereas the 16-vector-per-word system has 
a preprocessor error rate of 11.9 percent. The preprocessor decision 
threshold was set so that the preprocessor made a unique decision on 
76 percent of the trials; on the remaining 24 percent, an average of 
2.5 candidates (out of 129 possible) were passed on to the DTW 
processor. With this setting, the overall word error rates fell to 11.7 
percent for the 8-vector code books, and to 8.9 percent for the 
16-vector code books. 


3.3 Typical recognition example 


To illustrate how the addition of temporal information aids the 
preprocessor, Fig. 6 shows a recognition case in which a word (the 
digit zero) would have been misrecognized (as the digit three) based 
on VQ spectral distances alone, but is correctly recognized based on 
the combination of spectral and temporal distance. Shown in this 
figure are the log energy contour of the test word zero (Fig. 6a), the 
VQ spectral distance (frame by frame) for both the word zero (dashed 
line) and the word three (solid line, Fig. 6b), the log probability 
distances for both words (Fig. 6c), the accumulated spectral distance 
scores for both words (Fig. 6d), and the combined, accumulated total 
distance scores for both words (Fig. 6e). On the basis of VQ spectral 
matches, the preprocessor would have made a hard error since the 
distance for zero was not close enough to the distance for three; 
however, using the combined distance the correct word zero was 
uniquely recognized. The reason that the temporal distance helped so 
much, in this case, was the large temporal distance during the /O/ 
vowel in zero for the word three. Thus, although there is a code-book 
vector that matches the /O/ spectrum well in three, the probability of 
it occurring at the end of the word is very small. Although this example 
is an extreme case, it does illustrate well why the addition of temporal 
information to the preprocessor can help the performance to improve. 

Fig. 6—The enhanced recognition performance obtained by combining temporal and 
spectral distances in the preprocessor: (a) the test word (zero) log energy contour; (b) 
spectral and (c) temporal distances on a frame-by-frame basis (solid curve is for the 
word three, dashed curve is for the correct word zero); (d) accumulated spectral and (e) 
accumulated combined distance scores. (All panels are plotted against frame number.) 


IV. DISCUSSION 


The results presented in the previous section clearly show that the 
addition of temporal information to a word-based VQ preprocessor 
increases the accuracy of the recognizer and makes it more robust to 
vocabulary size and complexity. 

To gain perspective on how the current system performance com- 
pares with previous recognizers, Table III gives digit recognition error 
rates for the current system, for the best DTW recognizer,’ and for a 
previous recognizer using a word-based VQ preprocessor.’ Similarly, 
Table IV gives word recognition error rates, for the airline vocabulary, 
for the current system, for the best DTW recognizer, and for a previous 
recognizer using a word-based VQ preprocessor.” 

For the digits, the DTW system performs slightly better, on average, 
than the current system. However, the best performance is on test 
sets DIG1 and DIG2, from which the word reference templates were 
derived. On the test sets DIG3 and DIG4, the current system performed 
slightly better than the DTW recognizer. On the test set DIG2 (which 
was the only common one between the current system and the previous 
recognizer with the preprocessor), the system performances were 
essentially the same. 

Table III—Average digit error rates for three recognition systems 

                         Average Digit Error Rate (%) 
Recognizer             DIG1    DIG2    DIG3    DIG4    Overall 
Current system          1.3     2.3     2.2     2.8     2.2 
DTW alone               0.0     0.6     2.7     3.9     1.8 
Previous recognizer      —      2.0      —       —       — 

Table IV—Average word error rates for three recognition 
systems for the airline vocabulary 

Recognizer             Average Word Error Rate (%) 
Current system                  8.9 
DTW alone                      10.2 
Previous recognizer            12.6 

For the airline vocabulary we see that the error rate of the current 
system is 1.3 percent lower than that of the DTW recognizer alone, 
and 3.7 percent lower than the previous recognizer based on the VQ 
preprocessor. For this vocabulary a real performance improvement 
has been achieved. 


4.1 Computational considerations 


It remains for us to show that this increase in system performance 
is achieved at essentially no increase in system cost (i.e., computational 
complexity). To do this we define the following system variables: 

L = Code-book size 
V = Vocabulary size 
I = Average number of frames in a word 
Q = Number of templates per word in DTW 
p = LPC order 
γ = Average fraction of words that are resolved in the preprocessor 
β = Average fraction of words passed on to DTW processor, when 
more than a single word candidate exists. 

The computation of the preprocessor (in multiplications and additions) 
can be expressed as: 

$$C_{PRE} = V \cdot I \cdot L \cdot (p + 1) \quad (*, +),$$

and the computation of the DTW postprocessor is 

$$C_{POST} = (1 - \gamma)\, \beta\, C_{DTW},$$

where 

$$C_{DTW} = V \cdot Q \cdot \frac{I^2}{3}\, (p + 1) \quad (*, +).$$

The overall computation of the recognizer is 

$$C_T = C_{PRE} + C_{POST} = V \cdot I \cdot (p + 1) \left( L + (1 - \gamma)\, \beta\, \frac{Q I}{3} \right).$$

The ratio between the full DTW computation (without a preprocessor) 
and the current recognizer computation is then 

$$R = \frac{C_{DTW}}{C_T} = \frac{Q I / 3}{L + (1 - \gamma)\, \beta\, Q I / 3}.$$

Substituting typical values of Q = 12, I = 40, p = 8, L = 8 (or 16), 
(1 − γ) = 0.25, β = 0.02, we get 

R = 20 (L = 8) 
  = 10 (L = 16). 

Thus, a computational reduction (over a standard DTW recognizer) 
of 10 to 20 times is achieved by the proposed recognizer. 
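
As a check on the arithmetic, the ratio can be evaluated directly. The sketch below assumes the DTW cost grows as V·Q·(I²/3)·(p + 1), so that V, I and (p + 1) cancel in the ratio; the function name is hypothetical and the default arguments are the typical values quoted above.

```python
def dtw_to_recognizer_ratio(L, Q=12, I=40, gamma=0.75, beta=0.02):
    """Ratio R of full-DTW computation to preprocessor-plus-DTW computation.

    Assumes C_DTW is proportional to Q * I**2 / 3 and C_PRE to I * L,
    with the common factor V * I * (p + 1) cancelled.
    """
    dtw_per_frame = Q * I / 3.0
    return dtw_per_frame / (L + (1.0 - gamma) * beta * dtw_per_frame)


for size in (8, 16):
    print(size, round(dtw_to_recognizer_ratio(size), 1))
```

Under these assumptions the ratio comes out near 18 for L = 8 and near 9.5 for L = 16, which matches the quoted figures of 20 and 10 after rounding. The preprocessor term L dominates the denominator, which is why halving the code-book size nearly doubles the saving.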


4.2 Further computational reduction via universal code book 


Although the performance of the proposed recognizer is impressive, 
it is possible to reduce its computational complexity even further. If 
we analyze the computation above, the major computation is in the 
preprocessor, where a total of V·L dot product distances need to be 
computed for each test frame. In the case where V is large (e.g., the 
129-word airline vocabulary), the total number of code-book vectors 
becomes large. In such a case it would be less expensive to use a 
universal code book (word and talker independent) of say 1024 vectors, 
and to choose the word-based code books from the universal code 
book. In this manner the number of distance computations per frame 
is fixed, and does not grow with the vocabulary size V. Of course, it 
must be shown that performance will not degrade, but it seems 
reasonable that for a sufficiently large code book, this will indeed be 
the case. 


V. SUMMARY 


In this paper we have shown how the addition of temporal infor- 
mation into the preprocessor of an isolated-word recognizer can im- 
prove the system performance and make the overall recognizer more 
robust to vocabulary size and complexity. The way in which the 
temporal information was added was straightforward; namely, we 
defined and measured from a training set a probability density function 
on the time of occurrence of the code-book vectors in the word-based 
VQ preprocessor. A temporal distance was defined as the scaled, 
negative log of the probability of occurrence of the vector 
chosen by the vector quantizer. A combined measure in which the 
spectral distance (from the VQ) was added to the temporal distance 
was used in the recognizer and shown to improve performance for 
both a digits and moderate-size airline vocabulary. Finally, it was 
shown that, on average, the computational complexity of the resulting 
recognizer was less than that required for a conventional dynamic 
time-warping implementation by at least a factor of ten, whereas the 
recognition performances of the two systems were comparable. 


A New Light Pen With Subpixel Accuracy 


By M. HATAMIAN*% and E. F. BROWN? 
(Manuscript received September 13, 1984) 


A new light pen system using a real-time algorithm for computing the 
centroid of the intensity pattern seen by a photosensor is described. This new 
light pen achieves an accuracy of better than one-quarter of a pixel on a 
cathode-ray-tube screen with 1K × 1K resolution. This corresponds to a 
capability of resolving 0.004 inch on a 15-inch screen, which is an improvement 
by a factor of at least 50 over the conventional light pens available today. The 
transistor-transistor logic hardware implementation of the algorithm is de- 
scribed in detail. The central part of the hardware is a real-time, moment- 
generating circuit that implements an efficient moment calculation algorithm 
reported earlier. Issues such as complexity of the hardware compared with the 
conventional techniques, and the possibility of implementing the algorithm in 
a single complementary metal-oxide semiconductor chip are addressed. Some 
line and curve drawing results sketched by the new light pen are presented 
and compared to similar drawings obtained by a conventional light pen. 


I. INTRODUCTION 


The light pen, an input device for graphics displays and work 
stations, has been around for quite some time. Among the many input 
devices for graphics systems, the light pen is probably the most natural 
one to use and, other than the touch-sensitive screen, the only one 
that directly interacts with the screen. In the world of interactive 
computer graphics, the light pen has been used mostly as a selection 
device for pointing to objects and characters on a Cathode-Ray-Tube 
(CRT) screen. Although very simple and inexpensive, the light pen 
has not been very popular among the users of graphics systems. Some 
have called it a clumsy, fragile device and predicted that it will become 
less and less popular as raster graphics grows.’ One of the main reasons 
for the light pen’s poor popularity is its limited resolution, which 
makes it unsuitable for accurate pointing, or writing and sketching on 
a CRT screen. Provided with improved resolution capability, the light 
pen is useful in a variety of applications, such as telewriting, accurate 
pointing and selecting, facsimile, brushing, and graphics generation. 
An example of the light pen used in a telewriting system is described 
in Ref. 2. 

The poor resolution of the light pen is a combined result of its 
relatively large field of view, the signal-to-noise ratio performance of 
the pen’s photosensor, limited image sharpness,” and, most important, 
the simple thresholding technique used to detect the position of the 
pen. We recently proposed a new approach to estimating the position 
of the pen by processing the analog signal generated by the light pen’s 
photosensor and computing the centroid of the two-dimensional image 
received by the sensor. Using this new technique we can achieve 
subpixel accuracies on a 1K × 1K screen. The algorithm is extensively 
treated in Ref. 4. 

This paper describes the real-time hardware implementation of the 
algorithm. First, we briefly describe the operation of the currently 
used light pen devices. In Section III we describe the new algorithm. 
Section IV discusses the hardware implementation and Section V 
presents some results. 


II. CONVENTIONAL APPROACH 


Conventional light pens used in today’s computer graphics systems 
employ a very simple circuit for detecting the position of the pen on 
the display screen. This is illustrated in Fig. 1. A photosensor is placed 
in the tip of a penlike housing. Whenever the scanning electron beam 
falls in the photosensor’s field of view, it generates an analog signal 
whose amplitude is a function of the light intensity received by the 
sensor. This analog signal is compared with a predefined threshold, 
and a pulse is generated indicating a hit. This pulse latches the values 
of two counters (the X-Y counters in Fig. 1) that track the horizontal 
and vertical positions of the beam. The two counters are reset at the 
beginning of each scan, namely, when the beam is at the top-most left 
corner of the screen. The X counter tracks the position of the beam 
on each line and is reset at the end of the line; the Y counter indicates 
the number of the line currently being scanned by the beam. 
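
The thresholding scheme described above can be mimicked in software as follows. This is only an illustrative model of the counter-and-latch behavior, with a hypothetical function name and a small list-of-lists standing in for the photosensor signal sampled along the raster.

```python
def first_hit(frame, threshold):
    """Scan a frame in raster order and return the (x, y) counter values
    latched at the first sample that exceeds the threshold."""
    for y, line in enumerate(frame):        # Y counter: current scan line
        for x, sample in enumerate(line):   # X counter: restarts on every line
            if sample > threshold:
                return x, y                 # hit pulse latches both counters
    return None                             # no hit during this scan


frame = [
    [0, 0, 0, 0],
    [0, 1, 5, 9],
    [0, 2, 9, 9],
]
print(first_hit(frame, 4))
```

Because the reported point is simply the first sample to cross the threshold, it moves with the threshold setting and with the size of the field of view rather than tracking the center of the light pattern.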

Obviously, when the hit pulse is generated, the values of the X and 
Y counters indicate the coordinates of the hit point. These coordinates 
are latched into two registers to be used by a host processor (e.g., the 
graphics station’s display processor). Using this simple thresholding 
technique, the accuracy in estimating the position of the light pen is 
a function of the photosensor’s signal-to-noise ratio and the aperture 
of the photosensor (size of the field of view). The latter is more 
important. Many attempts have been made to improve the accuracy 
of the light pen by controlling the field of view of the pen and using 
photosensors with high sensitivity. 

Fig. 1—Conventional light pen system. 

Designers have employed various mechanical and optical arrange- 
ments in the tip of the light pen to improve the accuracy. All of these 
attempts, although successful in improving the performance, have not 
achieved the ultimate desired accuracy. To the best of our knowledge, 
the best of today’s available light pens do not have a repeatable 
accuracy of better than ±5 pixels (in both directions) on a screen with 
a resolution of 500 × 500 pixels. Besides, even this poor accuracy is 
only achievable when the pen is held perpendicular to the screen and 
not at an angle. 

Such poor accuracy in estimating the position of the light pen is 
mainly due to the simple thresholding technique used in detecting the 
hit point. In our approach, a different technique is used. Rather than 
thresholding the analog signal generated by the photosensor, we take 
advantage of the information contained in that signal to achieve an 
accurate and highly robust estimate of the position of the light pen on 
the display screen. As we will see in the following sections, using the 
right kind of processing can convert an ordinary light pen into one 
with an accuracy of one-quarter of a pixel on a display with 1024 × 
1024 resolution. 


III. MOMENT COMPUTATION APPROACH 
If the analog signal generated by the light pen’s photosensor is 
displayed on a monitor, one sees a cometlike pattern similar to what 
is shown in Fig. 2. The tail of the pattern is caused by the persistence 
of the phosphor used in the CRT display at which the light pen is 
pointed. A long persistence phosphor will cause a longer tail. In our 
new light pen, the centroid of this intensity pattern is used as the 
estimate of the position of the light pen. As shown in Ref. 4, this 
approach produces a highly accurate and robust estimate of the posi- 
tion. 

Fig. 2—Intensity pattern generated by the light pen’s photosensor. 

Fig. 3—Light pen system using the moment computation approach. (Block diagram: 
display and photosensor’s field of view, photosensor output, analog-to-digital 
conversion, real-time moment calculation, centroid calculation, centroid coordinates.) 

The x and y coordinates of the centroid are calculated in real time 
(60 fields/s) by computing various moments of the intensity pattern. 
The scheme is illustrated in Fig. 3. After proper amplification, the 
photosensor’s output is digitized and fed to a two-dimensional digital 
filter, which can compute the intensity moments of the two-dimensional 
signal represented by the photosensor’s output. The moments 
$m^{p,q}$ about the point (N, M) are defined as 


$$m^{p,q} = \sum_{i=0}^{N} \sum_{j=0}^{M} x(i, j)\,(N - i)^p\,(M - j)^q, \qquad (1)$$

where x(i, j) represent the digitized samples of the sensor’s output, 
and N and M are the horizontal and vertical resolution of the display 
system. We recently proposed a very efficient algorithm for real-time 
computation of (1). This algorithm is described and analyzed in Ref. 
5. It uses a set of identical single-pole digital filters and has a highly 
regular and expandable structure. In Ref. 5 we also present a Very 
Large-Scale-Integrated (VLSI) design for single-chip implementation 
of this algorithm in Complementary Metal-Oxide Semiconductor 
(CMOS) technology. The chip is capable of simultaneously computing 
16 moments $m^{p,q}$ (p = 0, 1, 2, 3; q = 0, 1, 2, 3) of a 512 × 512, 8 
b/pixel image in real time (i.e., at conventional video rate). 

In this application for estimating the position of the light pen, only 
the three moments $m^{0,0}$, $m^{0,1}$, and $m^{1,0}$ are used to compute the centroid 
coordinates as 


$$x_c = \frac{m^{1,0}}{m^{0,0}}, \qquad y_c = \frac{m^{0,1}}{m^{0,0}}. \qquad (2)$$


The subpixel accuracy in our estimation stems from the fact that the 
above division operations can be carried out in floating point mode. 
This generates results accurate to almost one digit after the decimal 
point, as we will see in Section V. 
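
The centroid estimate can be illustrated in software as follows. This sketch is not the hardware algorithm: the function name is hypothetical, and for simplicity the moments are taken about the origin (0, 0) rather than about the point (N, M), so the result is directly in pixel coordinates.

```python
def centroid(image):
    """Return (x_c, y_c) of an intensity pattern.

    image[j][i] is the sample on scan line j at pixel i. The zeroth and
    first moments are accumulated as integers; only the final divisions
    are done in floating point, which is where the subpixel part arises.
    """
    m00 = m10 = m01 = 0
    for j, line in enumerate(image):
        for i, value in enumerate(line):
            m00 += value
            m10 += i * value
            m01 += j * value
    if m00 == 0:
        return None  # no light seen by the sensor
    return m10 / m00, m01 / m00


pattern = [
    [0, 0, 0, 0],
    [0, 1, 3, 0],
    [0, 1, 3, 0],
]
print(centroid(pattern))  # (1.75, 1.5)
```

The example pattern is brighter in column 2 than in column 1, so the estimate lands between the two pixel centers, closer to the brighter one.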


IV. HARDWARE IMPLEMENTATION 


The light pen algorithm described above has been implemented as 
part of a larger graphics test bed described in Ref. 6. Figure 4 shows 
the light pen system used in the test bed. Except for the moment 
generator circuit, the rest of the blocks in this figure are part of the 
graphics test bed. As described earlier, the signal generated by the 
light pen’s photosensor is digitized and fed to the moment generator 
circuit, which generates a set of raw moments at the rate of 60 times 
a second (video field rate). These moments are read by the test bed’s 
control processor (a 68000 microprocessor), which calculates the cen- 
troid coordinates and feeds them to the display processor for proper 
action (e.g., writing, erasing, and brushing). 

Fig. 4—Light pen system using the moment approach implemented on a graphics 

Fig. 5—Digital filter representation of the moment computation algorithm. 

Fig. 6—Circuit and timing diagram of the row filter section of the moment generator. 


The moment generator circuit is basically a Transistor-Transistor 
Logic (TTL) implementation of the algorithm we described in Ref. 4. 
It consists of a number of single-pole digital filters, with transfer 
function 1/(z − 1), interconnected as shown in Fig. 5. This is only 
part of a larger filter network described in Ref. 5, mainly because 
higher-order moments are not required in this application. Each filter 
block is simply implemented by an accumulator, which is basically an 
adder with an output register and a feedback path. This is shown in 
Fig. 6 for the two blocks in the first row of Fig. 5, referred to as a row 
filter in Ref. 5. These two blocks operate at the pixel clock rate. Figure 
6 illustrates the operation of this simple arrangement of adders and 
registers. The rest of the blocks in Fig. 5, referred to as column filters, 
are implemented in exactly the same way, except that they run at the 
display’s scan-line clock rate rather than the much faster pixel clock 
rate. This feature relaxes the speed requirement on the adders in the 
column filter to the point where they can be implemented serially. 
This results in considerable savings in space and power consumption. 
This feature is extremely important in the VLSI implementation of 
the moment generator circuit, as discussed in Ref. 5. 
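
The idea of obtaining moments from cascaded accumulators can be illustrated for a single scan line. This is a simplified software analogy, not the circuit of Fig. 6: the function name is hypothetical, and each accumulator here adds its current input directly, whereas a filter with transfer function 1/(z − 1) includes a one-sample delay.

```python
def line_moments(samples):
    """Run two cascaded accumulators over one line of samples.

    After N samples the first accumulator holds sum(x[i]) and the second
    holds sum(x[i] * (N - i)), a first moment measured back from the end
    of the line. No multiplications are needed.
    """
    acc0 = 0
    acc1 = 0
    for x in samples:
        acc0 += x      # zeroth-order accumulator
        acc1 += acc0   # first-order accumulator, fed by the first one
    return acc0, acc1


line = [0, 1, 2, 1, 0]
m0, m1 = line_moments(line)
position = len(line) - m1 / m0
print(m0, m1, position)  # 4 12 2.0
```

The symmetric test pattern is centered on sample 2, and the position recovered from the two accumulator outputs is 2.0, using only additions per sample and one division per line.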


V. RESULTS 


Results obtained from the hardware implementation of our new 
light pen algorithm indicate that the moment computation approach 
can indeed achieve subpixel accuracy in estimating the position of the 
light pen. All the line and curve drawings presented in this section 
were drawn by the light pen on a 1000-line raster scan black and white 
monitor that uses a short persistence phosphor (2 to 3 μs). For a 
display monitor with a long persistence phosphor, the output of the 
photosensor’s amplifier should be differentiated and rectified before 
processing. See Ref. 4 for more details. 

Figure 7 shows a number of curves and lines drawn by the light pen 
using the moment computation approach. Figure 8 shows similar 
drawings using the conventional method in determining the position 
of the light pen (the same photosensor was used). Except for Figs. 7a and 
8a, a linear interpolation has been used for connecting the points. A 
comparison of the drawings in Figs. 7 and 8 clearly shows the degree 
of improvement achieved by the moment computation approach. Figure 
7b is of special interest. It shows that, even with a simple linear 
interpolation, light pens using this new algorithm can generate very 
high-quality writing on CRT displays. 

Fig. 7—Light pen drawings using the moment computation approach. 

Fig. 8—Light pen drawings using the conventional technique. (Panels (a) through (d).) 

To obtain some quantitative measure of the accuracy, the following 
experiment was performed. The light pen was attached to a micropo- 
sitioning device facing the display screen. The micropositioner was 
moved horizontally in steps of 0.001 inch, and the floating point x and 
y coordinates generated by the moment processor were recorded. The 
results are plotted in Fig. 9. From these results we can observe an 
accuracy of one-quarter of a pixel on a 1K × 1K resolution monitor.
The pen can resolve 0.004 inch on a 15-inch screen. It should be
noted that, although the resolution of our display system is 1K × 1K,
the signal fed to the moment generator circuit is subsampled by a
factor of 2, resulting in an effective input resolution of about 500 ×
500. It should also be noted that part of the error in the recorded data
for the x and y coordinates is due to the jitter in the mechanical setup 
in the above experiment. 
Fig. 9—Position computed by the moment algorithm in pixels as a function of the
movement of the pen by a micropositioner. 


To be of practical use in today’s graphics systems, the cost of this 
new light pen must be drastically reduced. We are currently studying 
the possibility of implementing the algorithm in a single-hybrid CMOS 
chip. This would include the Analog-to-Digital (A/D) converter, the
moment generator circuit, and postprocessing. A VLSI design for the moment
generator part of this chip has already been prepared.⁵ In applications
where an accuracy of one pixel is sufficient, the requirement on the 
size of the A/D and the adder circuits in the moment generator can 
be considerably relaxed. Preliminary investigations indicate that the 
moments of the area of the pattern in Fig. 2, rather than the intensity 
moments, are sufficient for one-pixel accuracy. Using the area mo- 
ment, if possible, reduces the A/D to a simple comparator (1 b/pixel), 
and the moment generator requires much less silicon area. All these 
options are being investigated and will be reported in the future. 
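
The difference between intensity moments and area moments can be illustrated with a short sketch. It is not the moment generator circuit of Ref. 5. It only computes a centroid from the zeroth and first moments of a small sampled pattern, first with the full sample values and then with each sample reduced to 1 bit by a comparator. The function name `centroid`, the list-of-rows layout, and the sample values in `spot` are assumptions made for illustration.

```python
def centroid(image, threshold=None):
    """Return the (x, y) centroid of a sampled pattern from its moments.

    image: list of rows of non-negative sample values (assumed layout).
    threshold: if given, each sample is first reduced to 1 bit, so the
    result uses area moments instead of intensity moments.
    """
    m00 = m10 = m01 = 0.0
    for y, row in enumerate(image):
        for x, value in enumerate(row):
            if threshold is None:
                w = float(value)                          # intensity weight
            else:
                w = 1.0 if value >= threshold else 0.0    # comparator output
            m00 += w          # zeroth moment
            m10 += x * w      # first moment in x
            m01 += y * w      # first moment in y
    if m00 == 0:
        raise ValueError("empty pattern")
    return m10 / m00, m01 / m00


# Hypothetical 3 x 4 patch of photosensor samples.
spot = [
    [0, 1, 2, 1],
    [0, 2, 8, 4],
    [0, 1, 4, 2],
]
intensity_xy = centroid(spot)             # uses the full sample values
area_xy = centroid(spot, threshold=2)     # uses 1 bit per sample
```

The two estimates differ by a fraction of a sample spacing in this example, which is the kind of trade-off the paragraph above describes.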


VI. CONCLUSION 


A new light pen system based on computing the centroid of an 
intensity pattern generated by a photosensor was presented. The 
hardware implementation of this system was described in detail. The
central part of the hardware is a TTL implementation of an efficient, 
real-time, moment-generating algorithm reported earlier. The mo- 
ments are used to compute the x and y coordinates of the centroid as 
an estimate of the position of the light pen on the screen. It was shown 
that this new technique can achieve an accuracy of at least one-quarter 
of a pixel on a display screen with a resolution of 1K × 1K. The pen
can thus resolve 0.004 inch on a 15-inch screen. This is an improve- 
ment by a factor of at least 50 over the conventional techniques used 
in today’s light pens. Such an improvement is gained at the cost of 
increasing the complexity of the hardware by a considerable factor. 
However, due to the simplicity of our moment-generating algorithm 
and the regularity of its structure, the required hardware can be 
implemented in a single CMOS chip. Such a chip can have many 
potential applications in the field of computer graphics. Work on this 
single-chip implementation of the algorithm is in progress. 
Sojourn Time Distribution in a
Multiprogrammed Computer System 
We present a method for calculating the moments and the distribution of
sojourn time in a multiprogrammed computer system. We assume that the 
CPU and I/O subsystem can be represented by a general state-dependent 
server who works according to the processor sharing discipline. Further, at 
most m jobs may be simultaneously receiving service. Thus, m is the multi- 
programming level of the system. The arrival of jobs occurs according to a 
Poisson process, and the arrivals must wait in a waiting area if m jobs are 
already receiving service. The method presented may be useful in designing 
the multiprogramming level needed to meet certain objectives on the charac- 
teristics of the sojourn time. 


I. INTRODUCTION 


This paper is concerned with finding the moments and the distri- 
bution of sojourn time in a multiprogrammed computer system. We 
assume that jobs arrive into a computer system according to a Poisson 
process at a rate of $\lambda$. The computer system is divided into two areas,
a waiting area and a service area. The service area can hold at most
m jobs, where m is the multiprogramming level of the system. The
jobs in the service area receive service from a server whose service rate
is assumed to be state dependent. The server operates at a rate $\mu_j$
[using the Processor Sharing (PS) discipline] whenever there are j
jobs in the service area, with $\mu_m = \mu$. We assume that each job’s service
time is exponentially distributed, so that on every service completion,
each customer in the service area is equally likely to leave the system. 
An arrival into the system goes directly into the service area if it is 
not full. An arriving job goes to the waiting area of unlimited size if 
there are already m jobs in the service area. The customers are drawn 
from the waiting area into the service area according to the First-In 
First-Out (FIFO) discipline. Figure 1 depicts this situation schemati- 
cally. The model may be useful in determining the multiprogramming 
level needed to meet a certain response-time characteristic. 

The model is an obvious generalization of the M/M/1 First-Come
First-Served (FCFS) queue (m = 1) and of the M/M/1 PS queue
($\mu_j = \mu$ and $m = \infty$). A little thought should convince the reader that
the M/M/m FCFS queue is also a special case of this model with
$\mu_i = i\sigma$ for $i = 1, \ldots, m$, where $\sigma$ is the service rate of each server.
Avi-Itzhak and Heyman proposed the use of a state-dependent server
to approximate the CPU and I/O subsystem.¹ We depict a multiple
CPU and disk subsystem in Fig. 2, which is approximated by a
state-dependent server in our model. The service rate $\mu_j$ of our model is
obtained by solving for the throughput in the closed queueing network
of Fig. 2 with a population size of $j$. Reiser and Lavenberg describe a
method of solving for the throughput of such a closed queueing network.²
Fredericks obtains approximations for mean delays in a multiprogrammed
computer system and discusses the question of accuracy of
the state-dependent server model.³ Konheim and Reiser examine a
computer system with one disk and one CPU, subject to a bound on
the number of jobs present.⁴ Mitra obtains the waiting time distribution
for a computer system fed by jobs from a finite number of
terminals.⁵ Salza and Lavenberg investigate hierarchical decomposition
methods for approximating response-time distributions in certain
closed queueing network models of computer performance.⁶ Coffman,
Muntz, and Trotter obtain the sojourn time distribution of the
M/M/1 queue with processor sharing discipline.⁷

Fig. 1—Model of a multiprogrammed computer. (A waiting area feeds a service area of capacity m; service rate = $\mu_i$, $i = 1, \ldots, m$.)

Fig. 2—Model of a multiple CPU and disk subsystem. (s CPUs; FCFS discipline; throughput = $\mu_i$, $i = 1, \ldots, m$.)

In Section II of this paper, we discuss some preliminary results. 
Sections III and IV are concerned with finding the moments of sojourn 
time. In Section V, we provide a method of obtaining the distribution 
of sojourn time. In Section VI, we present some numerical examples. 


II. DEFINITIONS AND PRELIMINARIES

Let us define the following quantities:

$N$ = Number in system seen by an arrival, excluding itself.
$P_n = P(N = n)$.
$I$ = Number in system seen by a customer entering the service area, including itself.
$q_i = P(I = i)$.
$U$ = Time spent in the waiting area by an arrival.
$F(u) = P(U \le u)$.
$z_n = E(U^n)$.
$T$ = Time spent in the service area by a tagged customer.
$B_i(s)$ = Laplace-Stieltjes transform of the conditional distribution of $T$ given $I = i$, i.e.,

$$B_i(s) = \int_0^\infty e^{-st}\, dP(T \le t \mid I = i).$$

$$x_{in} = E(T^n \mid I = i) = (-1)^n \left.\frac{d^n B_i(s)}{ds^n}\right|_{s=0}.$$

$V$ = sojourn time, i.e., $V = U + T$.


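By solving the equations of a birth and death process, one obtains the
following results directly.
Let

$$w_n = \prod_{j=1}^{n} \frac{\lambda}{\mu_j}, \qquad n = 1, \ldots, m.$$

Then

$$P_0 = \left[\, 1 + \sum_{n=1}^{m-1} w_n + \frac{w_m}{1-\rho} \right]^{-1},$$

where

$$\rho = \lambda/\mu_m = \lambda/\mu$$

and

$$P_n = \begin{cases} P_0\, w_n & \text{if } n = 1, \ldots, m \\ \rho^{\,n-m} P_m & \text{if } n = m+1, \ldots \end{cases}$$

Let L be the mean number in the system. Then

$$L = P_0 \sum_{n=1}^{m-1} n\, w_n + \frac{m P_m}{1-\rho} + \frac{\rho P_m}{(1-\rho)^2}.$$

Let W be the mean time spent in the system. Then, from Little’s law,
$W = L/\lambda$.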
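
The birth and death results above can be evaluated numerically. The sketch below is a minimal illustration, assuming the service rates are supplied as a list `mu` with `mu[j-1]` equal to the rate when j jobs are in the service area, so that `len(mu)` is the multiprogramming level m. The function names and the product weights `w[n]` (the product of λ/μ_j for j up to n) are labels chosen here for illustration.

```python
def stationary(lam, mu, n_max):
    """Return [P_0, ..., P_n_max] for the state-dependent server model."""
    m = len(mu)
    rho = lam / mu[-1]
    if rho >= 1:
        raise ValueError("unstable system: need lam < mu_m")
    w = [1.0]                                   # w[n] = prod_{j<=n} lam / mu_j
    for j in range(1, m + 1):
        w.append(w[-1] * lam / mu[j - 1])
    # Normalization: states 0..m-1 individually, geometric tail from state m.
    p0 = 1.0 / (sum(w[:m]) + w[m] / (1.0 - rho))
    probs = [p0 * w[n] for n in range(m + 1)]
    for _ in range(m + 1, n_max + 1):
        probs.append(probs[-1] * rho)           # P_n = rho^(n-m) P_m beyond m
    return probs[: n_max + 1]


def mean_number_and_time(lam, mu):
    """Return (L, W): mean number in system and mean time via Little's law."""
    m = len(mu)
    rho = lam / mu[-1]
    p = stationary(lam, mu, m)
    L = sum(n * p[n] for n in range(m))
    L += p[m] * (m / (1.0 - rho) + rho / (1.0 - rho) ** 2)
    return L, L / lam


L_mm1, W_mm1 = mean_number_and_time(0.9, [1.0])        # M/M/1, rho = 0.9
L_mm2, W_mm2 = mean_number_and_time(1.0, [1.0, 2.0])   # M/M/2, unit-rate servers
```

For the M/M/1 case with ρ = 0.9 and μ = 1 this gives L = 9 and W = 10, the familiar values ρ/(1 − ρ) and 1/(μ − λ).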


III. CHARACTERIZATION OF U, I, AND V


In this section, we will derive the distribution and moments of U
and the distribution of I, and we will characterize the moments of V.
It is clear that

$$U = \begin{cases} 0 & \text{if } N \le m-1 \\ \text{sum of } (N - m + 1) \text{ independent exponentials, each of rate } \mu & \text{if } N \ge m. \end{cases}$$

Then,

$$F(u) = \sum_{k=0}^{m-1} P_k + \sum_{k=m}^{\infty} P_k \int_0^u \mu e^{-\mu t}\, \frac{(\mu t)^{k-m}}{(k-m)!}\, dt \qquad \text{for } u \ge 0$$

and

$$z_n = EU^n = \int_0^\infty u^n\, dF(u) = \sum_{k=m}^{\infty} P_k \int_0^\infty u^n\, \mu e^{-\mu u}\, \frac{(\mu u)^{k-m}}{(k-m)!}\, du.$$

This can be shown to be equal to

$$\frac{P_m\, n!}{(1-\rho)(\mu-\lambda)^n}.$$

This result could also have been obtained as follows:

$$z_n = EU^n = E(U^n \mid N \ge m)\,P(N \ge m) + E(U^n \mid N < m)\,P(N < m).$$

Given that a customer waits (i.e., $N \ge m$), the waiting room behaves
like an M/M/1 FCFS queue. Thus, the wait time corresponds to the
sojourn time of an M/M/1 queue and is exponential with parameter
$(\mu - \lambda)$. So,

$$z_n = \frac{n!}{(\mu-\lambda)^n}\, \frac{P_m}{1-\rho}.$$
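We will now derive the distribution of I. Consider first the case
where $I > m$:

$$q_i = \int_{0+}^{\infty} P(I = i \mid U = u)\, dF(u) = \int_0^\infty \frac{e^{-\lambda u}(\lambda u)^{i-m}}{(i-m)!} \sum_{k=m}^{\infty} P_k\, \mu e^{-\mu u}\, \frac{(\mu u)^{k-m}}{(k-m)!}\, du.$$

After some algebra one can show that

$$q_i = P_i \qquad \text{for } i > m.$$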
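
The algebra behind this identity can be spelled out in two steps, using only the birth and death result $P_k = \rho^{k-m} P_m$ for $k \ge m$ with $\rho = \lambda/\mu$. This is a worked sketch of one possible route, not necessarily the one the authors had in mind.

```latex
% Step 1: collapse the sum over k into a single exponential.
\sum_{k=m}^{\infty} P_k\, \mu e^{-\mu u} \frac{(\mu u)^{k-m}}{(k-m)!}
  = P_m\, \mu e^{-\mu u} \sum_{r=0}^{\infty} \frac{(\rho \mu u)^{r}}{r!}
  = P_m\, \mu e^{-(\mu-\lambda) u}.

% Step 2: integrate against the Poisson probability of i - m arrivals.
q_i = P_m\, \mu\, \frac{\lambda^{i-m}}{(i-m)!}
      \int_0^\infty u^{i-m} e^{-\mu u}\, du
    = P_m\, \mu\, \frac{\lambda^{i-m}}{(i-m)!} \cdot \frac{(i-m)!}{\mu^{\,i-m+1}}
    = \rho^{\,i-m} P_m = P_i .
```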


This can be explained by the fact that an arrival to and a departure 
from the waiting area see the same distribution of the number in the 
waiting area and that the number in the service area is constant, given 
that N > m. For I < m, the number in the system including the arrival 
is one more than the number seen by the arrival. So 


$$q_i = P_{i-1} \qquad \text{for } i = 1, \ldots, m-1.$$


For I = m, it is possible for an arrival into the service area to see m 
customers including itself in one of two possible ways. First, there 
were m — 1 customers in the system and an arrival occurred. Second, 
the arrival saw at least m customers and no one arrived during its 
wait. So, 


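$$q_m = P_{m-1} + \int_{0+}^{\infty} P(I = m \mid U = u)\, dF(u) = P_{m-1} + P_m$$

by the arguments used above. Thus,

$$q_i = \begin{cases} P_{i-1} & \text{if } i = 1, \ldots, m-1 \\ P_{m-1} + P_m & \text{if } i = m \\ P_i & \text{if } i > m. \end{cases}$$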


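We now characterize the moments of the sojourn time in terms of
$x_{in}$. In the next section, we will show how to calculate $x_{in}$.

$$EV^n = \sum_{j=0}^{n} \frac{n!}{j!\,(n-j)!}\, E(T^j U^{n-j})$$

and

$$E(T^j U^{n-j}) = \int_0^\infty u^{n-j} E(T^j \mid U = u)\, dF(u) = \sum_{i=1}^{\infty} \int_0^\infty u^{n-j} E(T^j \mid I = i)\, P(I = i \mid U = u)\, dF(u).$$

The last step follows from the fact that T is independent of U given
I. We now consider three cases:

1. For $j = n$,

$$ET^n = \sum_{i=1}^{\infty} x_{in}\, q_i.$$

2. For $0 < j < n$, the expression for $E(T^j U^{n-j})$ is

$$\sum_{i=m}^{\infty} x_{ij} \int_{0+}^{\infty} u^{n-j}\, \frac{e^{-\lambda u}(\lambda u)^{i-m}}{(i-m)!}\, dF(u),$$

which, after some algebra, reduces to

$$P_m \sum_{i=m}^{\infty} \frac{(i+n-j-m)!}{(i-m)!}\, \frac{\lambda^{i-m}}{\mu^{\,i+n-j-m}}\, x_{ij}.$$

3. For $j = 0$, but $n \ne 0$, the expression is simply

$$EU^n = z_n = \frac{P_m\, n!}{(1-\rho)(\mu-\lambda)^n}.$$

So,

$$E(T^j U^{n-j}) = \begin{cases} \displaystyle\sum_{i=1}^{\infty} x_{in}\, q_i & \text{if } j = n \\[2ex] \displaystyle P_m \sum_{i=m}^{\infty} \frac{(i+n-j-m)!}{(i-m)!}\, \frac{\lambda^{i-m}}{\mu^{\,i+n-j-m}}\, x_{ij} & \text{if } 0 < j < n \\[2ex] \displaystyle\frac{P_m\, n!}{(1-\rho)(\mu-\lambda)^n} & \text{if } j = 0, \text{ but } n \ne 0. \end{cases}$$
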


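IV. CALCULATION OF $x_{in}$

In this section, we show how to calculate $x_{in}$. The transforms $B_i(s)$
satisfy the following equations:

$$B_{i+1}(s) = \frac{\lambda + \mu_{i+1}}{\lambda + \mu_{i+1} + s} \left\{ \frac{\lambda}{\lambda + \mu_{i+1}}\, B_{i+2}(s) + \frac{\mu_{i+1}}{\lambda + \mu_{i+1}} \bigl( c_{i+1} B_i(s) + (1 - c_{i+1}) \bigr) \right\} \tag{1}$$

for $i = 1, 2, \ldots$ and

$$B_1(s) = \frac{\lambda + \mu_1}{\lambda + \mu_1 + s} \left\{ \frac{\lambda}{\lambda + \mu_1}\, B_2(s) + \frac{\mu_1}{\lambda + \mu_1} \right\}, \tag{2}$$

where

$$c_i = \begin{cases} \dfrac{i-1}{i} & \text{if } i < m \\[1.5ex] \dfrac{m-1}{m} & \text{if } i \ge m \end{cases} \qquad \text{and} \qquad \mu_i = \begin{cases} \mu_i & \text{if } i < m \\ \mu & \text{if } i \ge m. \end{cases}$$

These equations are obtained by assuming that the system is in state
i + 1 in steady state and conditioning on the time of the next event.
An event is defined to be a service completion or an arrival, whichever
occurs first. By rearrangement,

$$B_{i+2}(s) - \frac{\lambda + \mu_{i+1} + s}{\lambda}\, B_{i+1}(s) + \frac{\mu_{i+1} c_{i+1}}{\lambda}\, B_i(s) = -\frac{\mu_{i+1}}{\lambda}\,(1 - c_{i+1}).$$

Taking the nth derivative, multiplying by $(-1)^n$, and setting s = 0, we
get

$$x_{i+2,n} - \frac{\lambda + \mu_{i+1}}{\lambda}\, x_{i+1,n} + \frac{\mu_{i+1} c_{i+1}}{\lambda}\, x_{in} = -\frac{n}{\lambda}\, x_{i+1,n-1} \tag{3}$$

for $i \ge 1$, $n = 1, 2, \ldots$, and

$$x_{2n} - \frac{\lambda + \mu_1}{\lambda}\, x_{1n} = -\frac{n}{\lambda}\, x_{1,n-1}, \tag{4}$$

where

$$x_{i0} = 1, \qquad i \ge 1. \tag{5}$$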


Equation (3) is a second-order partial difference equation with variable
coefficients. However, we note that for $i \ge m$, the coefficients do not
vary with i. So we will first show a method of solving an equivalent
system with coefficients that do not vary with i and then present a
procedure for the solution of eq. (3). The interested reader is referred
to Boole⁸ or Jagerman⁹ for details of the techniques used to solve this
difference equation. Consider the equation

$$y_{i+2,n} - \frac{\lambda + \mu}{\lambda}\, y_{i+1,n} + \frac{\mu c}{\lambda}\, y_{in} = -\frac{n}{\lambda}\, y_{i+1,n-1} \tag{6}$$

for positive integer valued i and n, where $c = c_m$. For the homogeneous
version of this equation, the solution is

$$y_{in} = A_n \sigma_1^i + B_n \sigma_2^i,$$

where $\sigma_1$ and $\sigma_2$ are the roots of

$$\sigma^2 - \frac{\lambda + \mu}{\lambda}\, \sigma + \frac{\mu c}{\lambda} = 0,$$

i.e.,

$$\sigma_1 = \frac{\lambda + \mu + \sqrt{(\lambda + \mu)^2 - 4\lambda\mu c}}{2\lambda}$$

and

$$\sigma_2 = \frac{\lambda + \mu - \sqrt{(\lambda + \mu)^2 - 4\lambda\mu c}}{2\lambda}.$$
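
The two roots of the characteristic equation are easy to evaluate numerically. The following sketch assumes $c = (m-1)/m$ as defined for eq. (6). The function name `characteristic_roots` is chosen here for illustration. For a stable system the larger root comes out above one and the smaller root below one.

```python
import math


def characteristic_roots(lam, mu, m):
    """Roots of sigma^2 - ((lam + mu)/lam) sigma + mu*c/lam = 0, c = (m-1)/m."""
    c = (m - 1) / m
    s = lam + mu
    # The discriminant is at least (lam - mu)^2, so it is never negative.
    d = math.sqrt(s * s - 4.0 * lam * mu * c)
    return (s + d) / (2.0 * lam), (s - d) / (2.0 * lam)


sigma1, sigma2 = characteristic_roots(0.9, 1.0, 5)
fcfs1, fcfs2 = characteristic_roots(0.9, 1.0, 1)   # m = 1 gives c = 0
```

With m = 1 the smaller root is exactly zero and the larger root is (λ + μ)/λ.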


It can be shown that $\sigma_1 > 1 > \sigma_2$. The constants $A_n$ and $B_n$ will be
evaluated later. To find a particular solution to (6), we define the
translation operator $E$ and the difference operator $\Delta$ such that

$$E u_i = u_{i+1}$$

and

$$\Delta u_i = u_{i+1} - u_i = (E - 1)\,u_i.$$

We can then rewrite (6) as

$$(E - \sigma_1)(E - \sigma_2)\, y_{in} = -\frac{n}{\lambda}\, y_{i+1,n-1}.$$


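The solution is

$$y_{in} = \frac{1}{\sigma_1 - \sigma_2} \left[ \frac{1}{E - \sigma_1} - \frac{1}{E - \sigma_2} \right] \left( -\frac{n}{\lambda}\, y_{i+1,n-1} \right).$$

In order to solve this, we must evaluate $(E - \sigma)^{-1}\left(-\frac{n}{\lambda}\, y_{i+1,n-1}\right)$.
Now,

$$(E - \sigma)^{-1}\left(-\frac{n}{\lambda}\, y_{i+1,n-1}\right) = (E - \sigma)^{-1}\, \sigma^i \sigma^{-i} \left(-\frac{n}{\lambda}\, y_{i+1,n-1}\right) = \sigma^i\, (\sigma E - \sigma)^{-1}\, \sigma^{-i} \left(-\frac{n}{\lambda}\, y_{i+1,n-1}\right).$$

The last step is obtained from the shift formula (see footnote on page
73 of Boole⁸):

$$f(E)\,(a^i u_i) = a^i f(aE)\, u_i.$$

The expression now is

$$\sigma^i\, [\sigma (E - 1)]^{-1}\, \sigma^{-i} \left(-\frac{n}{\lambda}\, y_{i+1,n-1}\right) = \sigma^{i-1} \Delta^{-1}\, \sigma^{-i} \left(-\frac{n}{\lambda}\, y_{i+1,n-1}\right) = -\frac{n}{\lambda} \sum_{j=1}^{i} \sigma^{\,i-j}\, y_{j,n-1}.$$

So a particular solution is

$$y_{in} = -\frac{n}{\lambda(\sigma_1 - \sigma_2)} \sum_{j=1}^{i} \left( \sigma_1^{\,i-j} - \sigma_2^{\,i-j} \right) y_{j,n-1}.$$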

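The general solution is

$$y_{in} = -\frac{n}{\lambda(\sigma_1 - \sigma_2)} \sum_{j=1}^{i} \left( \sigma_1^{\,i-j} - \sigma_2^{\,i-j} \right) y_{j,n-1} + A_n \sigma_1^i + B_n \sigma_2^i \tag{7}$$

for positive integer valued i and n. It is easy to verify that this satisfies
eq. (6).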


4.1 Calculation of $A_n$


We first state that $y_{in}$ has a probabilistic interpretation. We replace
the original system by a new one in which the service rate of the server
is $\mu$, regardless of the number of customers in the service area. Further,
consider a tagged customer whose probability of leaving the system is
$1/m$ at each service completion, whenever there are at least two
customers in the service area. Then eq. (6) describes the behavior of
the conditional moments of the time spent in the service area by the
tagged customer in this new system for $i \ge 2$. Clearly, as i tends to
infinity, the conditional moments of the original and new systems are
identical. Further, as i tends to infinity, the server is going to operate
at rate $\mu$ for a long time in the original system. If we now tag a
customer in the service area, it will leave the system with probability
$1/m$ and, therefore, T is exponential with rate $\mu/m$ as i tends to infinity
in the original system. So

$$\lim_{i \to \infty} y_{in} = \lim_{i \to \infty} x_{in} = n!\,(m/\mu)^n.$$

If we divide eq. (7) by $\sigma_1^i$ and let i tend to infinity, we have

$$A_n = \frac{n}{\lambda(\sigma_1 - \sigma_2)} \sum_{j=1}^{\infty} \sigma_1^{-j}\, y_{j,n-1}.$$

Thus,

$$y_{in} = \frac{n}{\lambda(\sigma_1 - \sigma_2)} \left( \sum_{j=i+1}^{\infty} \sigma_1^{\,i-j}\, y_{j,n-1} + \sum_{j=1}^{i} \sigma_2^{\,i-j}\, y_{j,n-1} \right) + B_n \sigma_2^i. \tag{8}$$

At this point, it is possible to verify that this result indeed yields the
familiar answer for the M/M/1 FCFS queue. To do this, one has to
set m = 1. This yields $c = 0$ and $\sigma_2 = 0$. After some algebra, one obtains
$y_{in} = n!/\mu^n$ for $i \ge 1$ as expected.
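
The M/M/1 check can also be carried out numerically. The sketch below evaluates the right-hand side of eq. (8) with the infinite sum truncated after a fixed number of terms, which is an approximation introduced here and not part of the paper's procedure. The helper name `eq8`, the argument `terms`, and the use of a callable `y_prev` for the previous-order moments are assumptions made for illustration.

```python
def eq8(i, n, y_prev, lam, s1, s2, b_n=0.0, terms=400):
    """Right-hand side of eq. (8) with the infinite sum truncated.

    y_prev(j) returns y_{j,n-1}; s1 and s2 are the roots sigma_1, sigma_2.
    """
    upper = sum(s1 ** (i - j) * y_prev(j) for j in range(i + 1, i + terms + 1))
    lower = sum(s2 ** (i - j) * y_prev(j) for j in range(1, i + 1))
    return n / (lam * (s1 - s2)) * (upper + lower) + b_n * s2 ** i


lam, mu = 0.9, 2.0
s1, s2 = (lam + mu) / lam, 0.0            # m = 1: c = 0, so sigma_2 = 0
y1 = eq8(3, 1, lambda j: 1.0, lam, s1, s2)        # should be close to 1/mu
y2 = eq8(3, 2, lambda j: 1.0 / mu, lam, s1, s2)   # should be close to 2/mu^2
```

Because $\sigma_2 = 0$, only the j = i term of the second sum survives, and the value does not depend on the trial constant $B_n$.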


4.2 Calculation of $B_n$


We have so far been able to find $y_{in}$ up to a constant, and a
comparison of (3) and (6) should convince the reader that $y_{in}$ obtained
from (8) satisfies (3) for $i \ge m$. Thus, after finding $x_{in}$ for $i \ge m$, we
can recursively compute $x_{m-1,n}, \ldots, x_{1n}$ by using (3). It is clear that
each of these variables is a linear function of the unknown constant
$B_n$. Finally, we will find $B_n$ to satisfy (4). This is done by choosing two
trial values of $B_n$, evaluating two sets of $x_{in}$, and using linear
interpolation to satisfy (4). The formal procedure to do this is as follows:

1. Set n = 1. Set $x_{i0} = y_{i0} = 1$ for all $i \ge 1$.

2. Select two trial values of $B_n$, say $B_n^1$ and $B_n^2$.

3. For each $B_n^k$ (k = 1, 2), use (8) to obtain $y_{in}^k$ for $i \ge 1$ and k = 1,
2. In this step, the terms $y_{j,n-1}$ on the right-hand side of (8) should not
be indexed by k.

4. Set $x_{in}^k = y_{in}^k$ for $i \ge m$ and k = 1, 2.

5. Use (3) recursively to compute $x_{m-1,n}^k, \ldots, x_{1n}^k$ for k = 1, 2. In
this step, the terms $x_{i+1,n-1}$ on the right-hand side of (3) should not be
indexed by k.

6. From $x_{in}^k$ (k = 1, 2) evaluate the left-hand side of (4). Let $L^k$ be
the left-hand side corresponding to $B_n^k$ for k = 1, 2.

7. Since $x_{in}$ is a linear function of $B_n$ for all i, we have

$$x_{in} = \gamma\, x_{in}^1 + (1 - \gamma)\, x_{in}^2$$

and

$$y_{in} = \gamma\, y_{in}^1 + (1 - \gamma)\, y_{in}^2 \qquad \text{for all positive } i,$$

where

$$\gamma = \left( -\frac{n}{\lambda}\, x_{1,n-1} - L^2 \right) \Big/ \left( L^1 - L^2 \right).$$

Note that $\gamma$ does not have to be between zero and one.

8. Set n = n + 1 and go to step 2.
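
Steps 6 and 7 amount to a two-point interpolation on a quantity that is linear in the unknown constant. The sketch below isolates that step. The callable `lhs` is a hypothetical stand-in for steps 3 through 6 (it maps a trial value of $B_n$ to the left-hand side of (4)), and `target` plays the role of the right-hand side of (4). Neither name comes from the paper.

```python
def interpolate_trials(lhs, target, b1, b2):
    """Return (gamma, B) such that lhs(B) == target, for lhs affine in B."""
    l1, l2 = lhs(b1), lhs(b2)
    if l1 == l2:
        raise ValueError("the two trial values give the same left-hand side")
    gamma = (target - l2) / (l1 - l2)
    return gamma, gamma * b1 + (1.0 - gamma) * b2


# Toy affine relation standing in for steps 3-6 (assumed, for illustration).
gamma, b_star = interpolate_trials(lambda b: 3.0 * b - 2.0, 10.0, 0.0, 1.0)
```

In this toy case the weight comes out as −3, outside the interval from zero to one, yet the combined value still satisfies the target equation exactly, which is the point of the remark in step 7.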
V. THE DISTRIBUTION OF SOJOURN TIME


In this section, we provide a direct method of obtaining the
distribution of sojourn time. First, we observe that U and T are independent
random variables conditional on I. Thus,

$$P(V \le t) = \sum_{i=1}^{\infty} \bigl[ P(U \le t \mid I = i) * P(T \le t \mid I = i) \bigr]\, q_i, \tag{9}$$

where * is the convolution operator. It is easy to see that

$$P(U \le t \mid I = i) = \frac{P(I = i \mid U = 0)\,P(U = 0) + \int_{0+}^{t} P(I = i \mid U = u)\, dF(u)}{q_i},$$

from which we obtain, after some routine algebra (for $t \ge 0$),

$$P(U \le t \mid I = i) = \begin{cases} 1 & \text{if } i \le m-1 \\[1ex] 1 - P_m e^{-\mu t} / q_m & \text{if } i = m \\[1ex] \displaystyle\int_0^t \mu e^{-\mu u}\, \frac{(\mu u)^{i-m}}{(i-m)!}\, du & \text{if } i > m. \end{cases} \tag{10}$$

If we let $V(s)$ and $U_i(s)$ be the Laplace-Stieltjes transforms of
$P(V \le t)$ and $P(U \le t \mid I = i)$, respectively, then from (9) we have

$$V(s) = \sum_{i=1}^{\infty} B_i(s)\, U_i(s)\, q_i. \tag{11}$$


It is possible to use the method of Section IV to solve eq. (1) for $B_i(s)$
directly. In fact, for $i \ge m$, $B_i(s)$ has the form

$$B_i(s) = \frac{\mu}{\mu + sm} + B(s)\, \sigma_2^i(s), \tag{12}$$

where

$$\sigma_2(s) = \frac{\lambda + \mu + s - \sqrt{(\lambda + \mu + s)^2 - 4\lambda\mu c}}{2\lambda}.$$

After finding $B_i(s)$ for $i \ge m$, we can recursively compute $B_{m-1}(s),
\ldots, B_1(s)$ by using (1). It is clear that all of these quantities depend
on the unknown function $B(s)$. Finally, one can find $B(s)$ to satisfy
(2). This would have to be done by choosing two trial values of $B(s)$,
evaluating two sets of $B_i(s)$, and using linear interpolation to satisfy
(2). It is possible to show that $V(s)$ simplifies [by using eqs. (10),
(11), and (12)] to

$$V(s) = \sum_{i=1}^{m-1} B_i(s)\, q_i + B_m(s)\, \frac{\mu q_m + s P_{m-1}}{\mu + s} + \frac{\lambda \mu P_m}{\mu + s} \left( \frac{\mu}{[s + \mu(1-\rho)]\,(\mu + sm)} + \frac{B(s)\, \sigma_2^{m+1}(s)}{s + \mu[1 - \rho\,\sigma_2(s)]} \right).$$

To find $P(V \le t)$, one would have to invert this transform numerically.

In the remainder of this section, we will show a method of calculating
$P(T \le t \mid I = i)$ directly. Let us assume that at all times, events
occur in this system according to a Poisson process at a rate of $\lambda +
\max(\mu_1, \mu_2, \ldots, \mu_m)$. Let $a$ be the maximum value of the arguments.
Whenever the system is in state i, arrivals occur with probability $\lambda/(\lambda
+ a)$, departures occur with probability $\mu_i/(\lambda + a)$, and with probability
$(a - \mu_i)/(\lambda + a)$ the system does not change its state. We assume that
$\mu_0 = 0$. Let $v_{ik}$ be the probability that a tagged customer departs on
the kth event given that $I = i$. Then, $v_{ik}$ can be recursively computed
from

$$v_{ik} = \begin{cases} \dfrac{\lambda}{\lambda + a}\, v_{i+1,k-1} + \dfrac{c_i \mu_i}{\lambda + a}\, v_{i-1,k-1} + \dfrac{a - \mu_i}{\lambda + a}\, v_{i,k-1} & \text{for } i \ge 1,\ k \ge 2 \\[2ex] \dfrac{(1 - c_i)\, \mu_i}{\lambda + a} & \text{if } k = 1 \end{cases}$$

and

$$v_{0k} = 0 \qquad \text{for } k \ge 1.$$

Since events are occurring at a Poisson rate of $\lambda + a$, departure of the
tagged customer at the kth event means that its time in the service
area is a Gamma random variable of order k and parameter $\lambda + a$.
Thus,

$$P(T \le t \mid I = i) = \sum_{k=1}^{\infty} v_{ik} \int_0^t (\lambda + a)\, e^{-(\lambda + a)u}\, \frac{[(\lambda + a)u]^{k-1}}{(k-1)!}\, du. \tag{13}$$
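
The recursion for the departure probabilities and the resulting Gamma mixture can be sketched in a few lines. This is an illustrative implementation under stated assumptions: the rates are passed as a list `mu` with `mu[j-1]` the rate for j jobs, the state space is truncated at `i_max + K` (enough for exact values of the first K terms for states up to `i_max`), and the series is cut off after K events. The names `departure_probs` and `service_area_cdf` are chosen here and do not come from the paper.

```python
import math


def departure_probs(lam, mu, i_max, K):
    """Return (v, nu): v[i][k] = P(tagged job leaves on k-th event | I = i)."""
    m = len(mu)
    a = max(mu)
    nu = lam + a                                  # uniformized event rate
    top = i_max + K                               # highest state kept
    rate = lambda i: mu[min(i, m) - 1]            # mu_i, frozen at mu_m for i >= m
    c = lambda i: (min(i, m) - 1) / min(i, m)     # c_i
    v = [[0.0] * (K + 1) for _ in range(top + 2)]  # row 0 stays zero: v_{0k} = 0
    for i in range(1, top + 1):
        v[i][1] = (1.0 - c(i)) * rate(i) / nu
    for k in range(2, K + 1):
        for i in range(1, top + 1):
            v[i][k] = (lam * v[i + 1][k - 1]
                       + c(i) * rate(i) * v[i - 1][k - 1]
                       + (a - rate(i)) * v[i][k - 1]) / nu
    return v, nu


def service_area_cdf(v_i, nu, t):
    """Evaluate the mixture of Gamma (Erlang) distributions at time t."""
    x = nu * t
    term = math.exp(-x)        # Poisson(x) probability of 0 events
    cum = 0.0
    total = 0.0
    for k in range(1, len(v_i)):
        cum += term            # P(Poisson(x) <= k - 1)
        total += v_i[k] * (1.0 - cum)   # Erlang-k CDF = P(Poisson(x) >= k)
        term *= x / k
    return total


# M/M/1 FCFS check (m = 1): T should be exponential with rate mu = 1.
v, nu = departure_probs(0.9, [1.0], i_max=3, K=200)
cdf_at_1_5 = service_area_cdf(v[2], nu, 1.5)
```

For m = 1 the tagged customer is the only job in the service area, so the mixture collapses to an exponential distribution with rate μ, which gives a convenient check on the recursion.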


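It is now possible to use (10), (11), and (13) to show that $V(s)$ is given
by

$$V(s) = \sum_{k=1}^{\infty} \left( \frac{\lambda + a}{\lambda + a + s} \right)^{k} \left[ \sum_{i=1}^{m-1} v_{ik}\, q_i + v_{mk}\, \frac{\mu q_m + s P_{m-1}}{\mu + s} + \sum_{i=m+1}^{\infty} v_{ik}\, q_i \left( \frac{\mu}{\mu + s} \right)^{i-m+1} \right].$$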

VI. NUMERICAL RESULTS

In Table I we present numerical results for four examples, obtained
by using the methods described earlier. In all examples, $\rho$ is 0.9 or 0.95
and $\mu$ is 1. The first example is for the M/M/1 FCFS queue and the
second for the M/M/5 FCFS queue. We approximate the M/M/1 PS
queue in the third example by choosing a high value (50) for m. The
fourth example is likely to be typical for a multiprogrammed computer
system where $\mu_i$ is increasing in i but at a diminishing rate. We get a
good correspondence between W (obtained from Little’s formula) and
EV in all cases. It is possible to verify that the higher moments in the
first two examples are very accurate.

Table I—Numerical results

[Table body not recoverable from the source. Legible entries: a fourth-moment value of 240,000; a row reading 20.0, 47,995, 3,838,494; and a row reading 14.388, 14.388, 380.17, 14,260, 686,744.]

The computational procedure described earlier works well for up to 
fairly high values of m (about 100). However, for very large values of 
m (say 200), the method is prone to numerical difficulties. We feel 
that this happens because m determines the number of recursions and 
this, in turn, determines the extent to which errors are compounded. 
However, in real-life applications, one may not encounter values of m 
greater than 40, which makes our method suitable for these applica- 
tions. 


Application of Decomposition Principle in M/G/1
Vacation Model to Two Continuum Cyclic 
Queueing Models—Especially Token-Ring LANs 
We apply a recent decomposition result of Fuhrmann and Cooper for the
M/G/1 queue with server vacations to obtain mean waiting times for the 
following two cyclic queueing models: The server scans at a constant velocity 
(1) serving work as it is encountered, or (2) collecting work that it serves at 
the end of each cycle. Model 1 describes token-ring polling in certain computer- 
communication networks; Model 2 has been used to describe mail pickup and 
delivery systems. 


I. INTRODUCTION AND SUMMARY 


Cyclic queueing models, in which a single server switches back and 
forth among a (large) number n of queues, have been studied by many 
authors. These studies were motivated largely by the need to describe 
the performance of electronic telephone-switching systems. Recent 
technological developments in computer-communication networks (lo- 
cal area networks, or LANs) have generated renewed interest in these 
models. In the present paper we consider two continuum cyclic 
queueing models, i.e., models where $n \to \infty$ while the total arrival rate
remains fixed. Model 1 describes the behavior of certain token-ring 
LANs, while Model 2 has been used to describe mail pickup and 
delivery systems. 
Exact analytic models for finite n tend to be very complicated (see,
for example, Refs. 1 through 8). One of the earliest techniques for the 
analysis of these complicated multiqueue models was to define the 
ordinary single-server vacation model, in which the server periodically 
leaves the queue and takes a “vacation”; this vacation model is then 
“connected” to the model of n queues served in cyclic order by 
interpreting the vacation as the time interval from when the server 
leaves a particular queue until its return to that queue after cycling 
through the other n — 1 queues.” 

The basis for the analysis in the present paper is to relate the server 
vacations to the cyclic queueing model in an entirely different manner, 
and then to apply a new stochastic decomposition result of Fuhrmann 
and Cooper⁹ for the M/G/1 vacation model. (As noted, in the present
paper we are interested in models where n is infinite. Another paper¹⁰
uses a related method to analyze certain cyclic queueing models where 
n is finite.) 

In both of our continuum models, the server scans (or polls) at a 
constant velocity along a closed path. Customers arrive according to a 
Poisson process (with rate $\lambda$) in time, and are uniformly and
independently distributed in space along the scanning path. Service times
have distribution function $H(\cdot)$, with mean $\tau$ and variance $\sigma^2$, and are
independent of the arrival process and each other. In Model 1, the 
server stops scanning and serves customers as they are encountered 
along the scanning path. In Model 2, the server collects customers 
(there is no time expended in collecting a customer) as they are 
encountered along the scanning path; when the server reaches a unique 
point on the path called the origin, the server stops and serves all the 
customers it has collected since last leaving the origin. Model 2 is 
closely related to a model studied by Nahmias and Rothkopf,¹¹ which
we shall refer to as Model 3, in which customers are served at the 
origin and are then randomly (uniformly) redistributed (delivered) 
over the scanning path on the server’s next cycle. This is in contrast 
to Models 1 and 2, where each customer departs from the system as 
soon as his service is completed. 

Model 1 provides a good description of a large, symmetric, token- 
ring LAN. In such a network, a number n of terminals (devices, work 
stations) are interconnected in either a physical or logical ring struc- 
ture. The terminals’ access to the transmission medium is controlled 
by a “token” (a signal) that circulates around the ring. A terminal 
gains access to the medium by seizing the circulating token as it goes 
by. It retains the token while it is transmitting, thereby preventing 
other terminals from simultaneously accessing the medium; it then 
releases the token to circulate around the ring, enabling another 
terminal to gain access to the transmission medium. (For a general 
description of LANs and token-passing protocols, see, for example,
Refs. 12 through 14.) 

One identifies the scan time c as the time required for the server to 
poll all the terminals once (equivalently, the time required for the 
token to cycle once around the LAN ring when no terminals are 
waiting to transmit). If the number n of terminals is large, and the 
terminals submit statistically identical loads (i.e., each terminal is 
characterized by the same arrival rate and distribution of service 
times), then there is a good correspondence between Model 1 and the 
LAN. 

Model 3 has been studied by Nahmias and Rothkopf,¹¹ who used it
to describe a delivery system in which a clerk (the server) traverses 
(scans) at a constant velocity a route along which letters (customers) 
are generated (arrive) randomly in space and time. As the clerk travels 
along the route, he picks up the letters that have been generated since 
his last traversal, and he delivers the letters (to locations distributed 
uniformly along the route) that were previously picked up and sorted. 
When the clerk reaches the end of the route he sorts (serves) the 
letters he has just picked up; then he again traverses the route, 
delivering the letters that have just been sorted, and picking up the 
new letters that have been generated since his last traversal of the 
route. This process is repeated indefinitely. 

For each model, the equilibrium cycle time $T_j$, defined as the time
(during equilibrium) between successive visits in Model j by the server
to any given point along the scanning path, has the same mean value
$T = E(T_j)$, given by

$$T = \frac{c}{1-\rho} \qquad (\rho < 1), \tag{1}$$

where c is the (constant) length of time the server spends scanning
during each cycle (which is the time to complete a scan cycle when
there is no work to be done), and $\rho$ ($= \lambda\tau$) is the server utilization.
[Equation (1) is easily derived by the following argument, given by 
Kuehn,’ for a very general model of n queues served in cyclic order 
by a single server: The mean cycle time T is the sum of the (constant) 
time c spent scanning and the mean time s spent serving per cycle; 
that is, $T = c + s$. Clearly, $s = \rho T$, and (1) follows. Note that T does
not depend on the form of the service-time distribution function, 
but only on its mean value; also, the parameter n does not appear 
explicitly. ] 

Our main results are these: Let $W_j$ ($j = 1, 2$) be the equilibrium
waiting time (time from request for service until start of service) in
Model j, and let $W_0$ be the equilibrium waiting time in the corresponding
M/G/1 queue (Model 0). Then,

$$E(W_1) = \tfrac{1}{2}\,T + E(W_0) \tag{2}$$

and

$$E(W_2) = T + E(W_0), \tag{3}$$

where T is given by (1), and $E(W_0)$ is given by the celebrated
Pollaczek-Khintchine formula [see, for example, Ref. 15, p. 217, eq.
(8.39)]:

$$E(W_0) = \frac{\rho\tau}{2(1-\rho)} \left( 1 + \frac{\sigma^2}{\tau^2} \right).$$

The simplicity of (2) and (3) and their similarity are quite remarkable.
We also define $S_j$ ($j = 0, 1, 2$) to be the equilibrium sojourn time
(waiting time plus service time) in Model j. Since

$$E(S_j) = E(W_j) + \tau \qquad (j = 0, 1, 2),$$

eqs. (2) and (3) are equivalent to the following two equations:

$$E(S_1) = \tfrac{1}{2}\,T + E(S_0) \tag{5}$$

and

$$E(S_2) = T + E(S_0). \tag{6}$$
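
Equations (1) through (3), (5), and (6) are simple enough to evaluate directly. The sketch below collects them in one function. The function name, argument names, and the dictionary keys are labels chosen here for illustration, and the numerical inputs in the example are arbitrary.

```python
def cycle_and_waits(lam, tau, var, c):
    """Mean cycle time and mean waiting/sojourn times for Models 0, 1, and 2.

    lam: total arrival rate; tau, var: mean and variance of service time;
    c: scan time per cycle with no work present.
    """
    rho = lam * tau
    if not 0.0 <= rho < 1.0:
        raise ValueError("need 0 <= rho < 1")
    T = c / (1.0 - rho)                                            # eq. (1)
    w0 = rho * tau * (1.0 + var / tau ** 2) / (2.0 * (1.0 - rho))  # Pollaczek-Khintchine
    w1 = 0.5 * T + w0                                              # eq. (2)
    w2 = T + w0                                                    # eq. (3)
    return {"T": T, "W0": w0, "W1": w1, "W2": w2,
            "S0": w0 + tau, "S1": w1 + tau, "S2": w2 + tau}


# Exponential service (variance = mean^2), half-loaded server, scan time 2.
r = cycle_and_waits(lam=0.5, tau=1.0, var=1.0, c=2.0)
```

In this example T = 4, so approximating the mean wait in Model 1 by T/2 = 2 would miss the additional M/G/1 term of 1.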


For Model 3, we define the equilibrium delivery time D3 as the 
elapsed time between the generation of a letter and its delivery to its 
destination. We will show that $E(D_3)$ is given by the following formula,
again remarkable in its simplicity: 


Bee Le Dp 
E(D3)=1+=T + 
(D3) T 9 eo 





E(W,). (7) 


The special case of (7) when $\sigma^2 = 0$ (i.e., when the time required to
sort a letter is constant) was found (in a different form, by a more
complicated argument) by Nahmias and Rothkopf.¹¹ The general result
(7) was found also by Shanthikumar,¹⁶ using level-crossing analysis.

The well-known textbook by Tanenbaum” discusses a model of a 
token-ring LAN with an arbitrary number n of terminals that, for n 
infinite, coincides with our Model 1. Tanenbaum gives eq. (1) and 
then states (p. 310) that the mean waiting time is “about half” 
the mean cycle time. It is interesting to note that, for the case 
of n infinite, eq. (2) shows that Tanenbaum’s approximation (i.e., 
$E(W_1) = T/2$) underestimates the correct value by exactly $E(W_0)$, an
amount that can be considerable, being essentially proportional to $\sigma^2$
and inversely proportional to $1 - \rho$. [The reason that $T/2$ underestimates
the correct value is a manifestation of the phenomenon of
length biasing, i.e., the cycle to which an arbitrary customer (the test
customer) arrives is stochastically longer than an arbitrary cycle. In
particular, if $T_1^*$ is the length of the cycle during which the test
customer arrives, then $E(T_1^*) = E(T_1) + V(T_1)/E(T_1)$, where $V(T_1)$
is the variance of the cycle times in Model 1 (see, for example, Ref.
15, pp. 200-6). Since $E(W_1) = E(T_1^*)/2$, it follows from this observation
and eq. (2) that $V(T_1) = 2T E(W_0)$. A practical implication of
this observation is that the mean waiting time $E(W_1)$ can be estimated
using measurements of cycle times only.]
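
The practical implication can be sketched as a small estimator: the mean wait in Model 1 is half the length-biased mean cycle time, which needs only the sample mean and variance of observed cycle times. The function name and the sample data below are hypothetical, and the use of the population variance is a choice made here for simplicity.

```python
from statistics import fmean, pvariance


def mean_wait_from_cycles(cycles):
    """Estimate E(W_1) and E(W_0) from measured cycle times alone."""
    t = fmean(cycles)                   # estimate of E(T_1)
    v = pvariance(cycles, mu=t)         # estimate of V(T_1)
    w1 = 0.5 * (t + v / t)              # E(W_1) = E(T_1*) / 2
    w0 = v / (2.0 * t)                  # from V(T_1) = 2 T E(W_0)
    return w1, w0


w1_const, w0_const = mean_wait_from_cycles([4.0, 4.0, 4.0])   # no variability
w1_var, w0_var = mean_wait_from_cycles([2.0, 6.0])            # same mean, spread out
```

With constant cycles the estimate reduces to half the cycle time, while the second sample, which has the same mean, yields a larger estimate because of the variance term.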

Several authors (see Refs. 4, 17, 18, and 19), using arguments more 
complicated than ours, have obtained results for related models with 
different queue disciplines (e.g., exhaustive service or gated service) 
and a finite number n of terminals; our result (2) can be obtained from 
their results when $n \to \infty$. Coffman and Gilbert²⁰ have analyzed Model
1 for the case of constant service times and, for this special case, 
derived a number of explicit distributional results, such as the distri- 
bution of waiting times. 

In Section II we state the M/G/1 decomposition result alluded 
to earlier. In Section III we apply this decomposition result to obtain 
eqs. (2), (3), (5), and (6). For completeness, in Section IV we quickly 
derive eq. (7) by directly comparing the mean delays in Models 2 
and 3. 


II. A STOCHASTIC DECOMPOSITION RESULT


At all times the server is either scanning or is serving customers. 
The basis for the analysis of this paper is to interpret the time intervals 
when the server is scanning to be vacations, and then to invoke 
Proposition 3 of Fuhrmann and Cooper.’ We define 


$\psi_j(\cdot)$ = the p.g.f. (probability generating function) of the equilibrium
distribution of the number of the customers present in Model
$j$ ($j = 1, 2$) at an arbitrary point in time;

$\chi_j(\cdot)$ = the p.g.f. of the equilibrium distribution of the number of
customers present in Model $j$ ($j = 1, 2$) at an arbitrary point
in time, given that the server is scanning (on vacation); and

$\pi(\cdot)$ = the p.g.f. of the equilibrium distribution of the number of
customers present in the corresponding M/G/1 queue at an
arbitrary point in time (or, equivalently, just after a service
completion epoch).


Thus, $\pi(\cdot)$ is given by a well-known formula [see, for example, eq.
(8.12), p. 210, Ref. 15]. It follows directly from Proposition 3 of
Fuhrmann and Cooper that


$$\psi_j(z) = \chi_j(z)\,\pi(z) \qquad (j = 1, 2). \tag{8}$$

In terms of mean values,

$$\psi_j'(1) = \chi_j'(1) + \pi'(1) \qquad (j = 1, 2). \tag{9}$$


In Section III we show that it is a simple matter to find $\chi_j'(1)$ for $j$ =
1, 2. Since (by Little's theorem) $\pi'(1) = \lambda E(S_0)$ and $\psi_j'(1) = \lambda E(S_j)$,
eq. (9) yields eqs. (5) and (6) or, equivalently, (2) and (3).


III. MEAN WAITING TIMES: MODELS 1 AND 2


In this section we derive formulas (2) and (3) for the mean waiting
times for Models 1 and 2. To do this, first note that (for either model)
during each customer's service time, k new customers arrive to the
system with probability


$$p_k = \int_0^\infty \frac{(\lambda t)^k}{k!}\, e^{-\lambda t}\, dH(t) \qquad (k = 0, 1, 2, \ldots). \tag{10}$$


We now define two auxiliary models, Auxiliary Model 1 and Auxiliary
Model 2. Auxiliary Model 1 is defined in exactly the same way
as Model 1 except for the following aspect: Now, whenever the server
encounters a customer, the customer is served in zero time and departs
from the system; coincidental with his departure, however, a batch of
k new customers joins the system (distributed along the scanning path
in a uniform and independent manner) with probability $p_k$, given by
(10). Thus, while the lengths of all service times have been collapsed
to zero, the number of customers in the system just after a service
completion epoch is stochastically the same for both Model 1 and
Auxiliary Model 1. (This is true in a distributional sense. Or, if we go
to the trouble to define Model 1 and Auxiliary Model 1 on the same
sample space, we can make this statement true on every sample path.)
This leads to the following conclusion: If we define $A_1$ to be the mean
number of customers present in Auxiliary Model 1, then


$$\chi_1'(1) = A_1. \tag{11}$$


To calculate $A_1$, let $\lambda_1^*$ and $S_1^*$ be the arrival rate and sojourn time
in Auxiliary Model 1. Then, by Little's theorem,


$$A_1 = \lambda_1^* E(S_1^*), \tag{12}$$


where, clearly,


$$E(S_1^*) = \frac{c}{2}. \tag{13}$$


To calculate $\lambda_1^*$, let N be the total number of customers (including
himself) generated by a Poisson arrival in Auxiliary Model 1; then
$\lambda_1^* = \lambda E(N)$. Now observe that the average number of new customers
generated when a customer is served is $\lambda\tau = \rho$; and each of these
customers will generate, on average, E(N) additional customers.
Hence, $E(N) = 1 + \rho E(N)$; that is, $E(N) = (1 - \rho)^{-1}$, and therefore

$$\lambda_1^* = \frac{\lambda}{1 - \rho}. \tag{14}$$
[Note that E(N) is precisely the mean number of customers served
during an M/G/1 busy period. Observe also that (14) follows immediately
from the requirement that the mean number of arrivals per
cycle be the same in Auxiliary Model 1 as in the original Model 1:
$\lambda_1^* c = \lambda T$.] Equations (11) through (14) yield
$\chi_1'(1) = \lambda c/[2(1 - \rho)]$; in light of (1), we have

$$\chi_1'(1) = \frac{\lambda T}{2}. \tag{15}$$

This completes the calculation of (9) for Model 1, from which the
main result (2) follows.

We now define Auxiliary Model 2 in a completely analogous manner,
that is, in exactly the same way as Model 2, except that now when a
customer is served (at the origin), he is served in zero time and is
instantaneously replaced by a batch of k customers with probability
$p_k$, given by (10). We define $A_2$ to be the mean number of customers
present in Auxiliary Model 2. By the same argument used earlier,


$$\chi_2'(1) = A_2; \tag{16}$$


and the same argument that was used to derive eq. (15) for Model 1
applies. Hence, combining the equations that are analogous to (12)
and (14), we have

$$A_2 = \frac{\lambda}{1 - \rho}\, E(S_2^*). \tag{17}$$

But, in contrast with (13), the mean sojourn time of a customer in
Auxiliary Model 2 is exactly the cycle time c,


$$E(S_2^*) = c. \tag{18}$$

Therefore, the analogue of (15) is

$$\chi_2'(1) = \lambda T, \tag{19}$$


and the main result (3) follows.
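
The mean-value chain of this section is short enough to check numerically. The sketch below is in Python and rests on two labeled assumptions: that eq. (1), which is not reproduced in this section, has the form T = c/(1 - rho) (as implied by the step from eq. (14) to eq. (15)), and that tau denotes the mean service time. The function and key names are illustrative.

```python
def auxiliary_model_means(lam, tau, c):
    """Mean-value quantities of Section III.

    lam: Poisson arrival rate
    tau: mean service time (assumed meaning of the symbol)
    c:   time for one full scan with no service
    """
    rho = lam * tau
    if not 0.0 <= rho < 1.0:
        raise ValueError("need rho = lam * tau < 1 for equilibrium")
    e_n = 1.0 / (1.0 - rho)    # E(N), the solution of E(N) = 1 + rho * E(N)
    lam_star = lam * e_n       # eq. (14)
    a1 = lam_star * c / 2.0    # eqs. (12) and (13)
    a2 = lam_star * c          # eqs. (17) and (18)
    T = c / (1.0 - rho)        # assumed form of eq. (1)
    return {"E(N)": e_n, "lam_star": lam_star, "A1": a1, "A2": a2, "T": T}

m = auxiliary_model_means(lam=0.5, tau=1.0, c=2.0)
print(m["A1"], 0.5 * m["T"] / 2.0)   # both sides of eq. (15)
print(m["A2"], 0.5 * m["T"])         # both sides of eq. (19)
```

The two printed pairs agree, which is the content of eqs. (15) and (19).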


IV. MEAN DELIVERY TIME: MODEL 3 


For completeness, we now derive eq. (7). This is accomplished by
directly comparing the mean delays in Models 2 and 3. Recall that in
these models, all customers are served at the origin. For either model,
consider an arbitrary customer (the test customer) and define


$\gamma$ = the mean number of customers in the test customer's batch;
$\gamma_b$ = the mean number of customers in the test customer's batch that
are served before the test customer; and
$\gamma_a$ = the mean number of customers in the test customer's batch that
are served after the test customer.
Then
$$\gamma = \gamma_b + \gamma_a + 1 \tag{20}$$
and, by symmetry,
$$\gamma_b = \gamma_a. \tag{21}$$

Now let $L_2$ denote the number of customers present in Model 2,
excluding the test customer, when the test customer enters service.
Then


$$E(L_2) = \frac{\lambda c}{2} + \gamma_b \rho + \gamma_a. \tag{22}$$


The term $\lambda c/2$ equals the mean number of customers who arrived
during the last scan, but behind the server. (These customers will be
collected on the server's next cycle.) The term $\gamma_b\rho$ ($= \gamma_a\rho$) equals the
mean number of customers who arrived during the service times of
the customers (in the test customer's batch) who were served before
the test customer. Finally, the term $\gamma_a$ equals the mean number of
customers that have not yet been served.

Now observe, on the other hand, that $L_2$ has the same distribution
as the number of customers present in Model 2 at an arbitrary point
in time, excluding the customer being served (if any). [This is true
because departures see the same distribution of customers that arrivals
see (see Ref. 15, p. 187) and the arrivals see time averages (see Ref.
21).] Hence, by Little's theorem,


$$E(L_2) = \lambda E(W_2). \tag{23}$$
Combining (21), (22), and (23) yields


$$\lambda E(W_2) = \frac{\lambda c}{2} + \gamma_a (1 + \rho). \tag{24}$$


Equations (3) and (24) determine $\gamma_a$. Now observe that, clearly,

$$E(D_3) = E(W_2) + \tau + \gamma_a \tau + \frac{c}{2}. \tag{25}$$

Equations (3), (24), and (25) now yield eq. (7) after some straightforward
algebra.
PLACE 2.0—An Interactive Program for PLL
Analysis and Design


By J. H. SAUNDERS* 
(Manuscript received January 14, 1985) 


Phase Locked Loop Analysis and Circuit Emulation (PLACE) is an inter- 
active program to assist in the design of Phase Locked Loop (PLL) systems. 
Written in C language, PLACE computes the loop filter components (up to 
active third-order filters); performs a stability analysis (finding the phase 
margin and damping ratio); and calculates the lockup time, hold range, and 
capture range. PLACE computes the PLL output jitter response due to 
incoming reference signal jitter, output jitter due to reference leakage through 
the phase detector, and output jitter response due to phase noise of the PLL 
components. The open and closed loop gain and phase, as well as jitter
response, are plotted. Additional features found in PLACE 2.0 are frequency
and magnitude of jitter peaking; a sensitivity analysis, which computes changes 
in loop performance as a function of component variation; and an interactive 
routine to help the designer optimize PLL performance. Currently residing on 
Digital Equipment Corporation’s VAX-11/780, AT&T 3B20, and IBM Sys- 
tem/370 processors, PLACE is presently available at most AT&T Bell Labo- 
ratories locations. This paper illustrates the capabilities of PLACE, shows 
several examples, and discusses the required calculations. 


I. INTRODUCTION 


Phase Locked Loops (PLLs) are used extensively in communication 
systems. Yet there has been no generally available computer-aided 
design program specifically targeted to assist the PLL designer in 
evaluating the performance of his system before building it in the 
laboratory. Phase Locked Loop Analysis and Circuit Emulation
(PLACE) is an interactive program that allows the designer to opti- 
mize his PLL, trading off one parameter (e.g., lockup time) for another 
(e.g., jitter response). PLACE performs the following functions: 

1. Determines if the PLL is stable by computing the phase margin, 
damping ratio, and undamped natural frequency. PLACE accounts for 
any parasitic Voltage Controlled Oscillator (VCO) or operational 
amplifier poles, and phase shift due to the feedback counter. 

2. Finds the appropriate loop filter components for a given damping 
ratio or 3-dB frequency. PLACE notifies the user if desired loop 
parameters result in unrealistic component values. 

3. Computes the PLL system bandwidth, noise bandwidth, and loop 
filter bandwidth. 

4. Approximates the hold range, capture range, and lockup time. 
(The hold range is normalized for the active loop filter case.) 

5. Determines the output jitter response due to the internal phase 
noise of the PLL components and the output jitter response due to 
jitter on the incoming reference signal. 

6. Determines the sensitivity of PLL performance to component 
variation. 

7. Plots open and closed loop gains, and phase noise response.

Section II of this paper discusses PLACE input/output, Section III 
shows several examples, and Section IV explains the calculations that 
PLACE performs. 


II. PLACE 2.0 INPUT/OUTPUT
2.1 PLACE 2.0 user input 


PLACE consists of a nongraphics module followed by a graphics 
module. The two modules are independent, and thus a user can invoke 
PLACE on a nongraphics terminal. PLACE recognizes the PLL shown 
in Fig. 1, where we have defined the following constants:* 

$K_p$ = phase detector gain constant (V/rad)

$K_V$ = VCO gain constant (Hz/V)

$N_{FB}$ = feedback counter divisor

$N_{FF}$ = feed-forward counter divisor.

The program will ask the user for the values of these constants.
Typical values for $K_p$ are 1.4 V/rad for an exclusive-or gate, 0.4 V/rad
for the RCA 4046 phase comparator II, 0.11 V/rad for the Motorola
4044 phase detector, and 0.16 V/rad for the Motorola 12040.

For best results, the VCO gain constant should be measured because
the PLL stability is highly dependent on $K_V$. The value of $K_V$ will
most likely vary over the VCO's frequency range; the sensitivity
analysis will compute the effect of this variation. If the results are
unacceptable, the user will need to build a linearizer in order to
maintain a constant $K_V$. (See Ref. 1 for an excellent example, or Ref.
2.) If a passive attenuator is used ahead of the VCO, the entered value
of $K_V$ must be reduced by the attenuation factor. For frequency
synthesizer applications, the feedback divisor $N_{FB}$ is often a variable,
and running PLACE twice with the maximum and minimum values
will show the change in loop performance.

* Variables used in this paper are defined in Appendix A.

Fig. 1—PLL model depicting the required input parameters for PLACE.

Next, PLACE asks the user if it is desired to account for any 
parasitic VCO pole; if a Voltage Controlled Crystal Oscillator (VCXO) 
is used, it is strongly recommended that this pole be entered since it 
is often the dominant pole of the PLL. (See Section 4.1.4 on how to 
measure the VCO pole.) For the active loop filter case, if it is desired 
to account for the op amp pole, the designer may lump this pole into 
the VCO pole and enter the value at this time. 

Next, PLACE asks for the reference frequency, i.e., the frequency 
entering the phase detector. The reference frequency is the frequency 
at which the phase comparison is performed. This information is 
required for the jitter analysis and implicitly defines the output fre- 
quency of the PLL. 

PLACE will then ask the user for the type of loop filter desired. 
Five types of loop filters are recognized (see Fig. 2): 


1. No loop filter, where

$$F(s) = 1.$$

2. Resistor Capacitor (RC) loop filter, where

$$F(s) = \frac{1}{1 + \tau s},$$

where $\tau = RC$.

3. Lag-lead loop filter, where

$$F(s) = \frac{1 + \tau_2 s}{1 + \tau_1 s},$$

where $\tau_1 = (R_1 + R_2)C$, $\tau_2 = R_2 C$.

4. Second-order active filter, where

$$F(s) = \frac{1 + \tau_2 s}{\tau_1 s},$$

where $\tau_1 = R_1 C$, $\tau_2 = R_2 C$.

5. Third-order active filter, where

$$F(s) = \frac{1 + \tau_2 s}{\tau_1 s (1 + \tau_3 s)},$$

where $\tau_1 = R_1 C_1$, $\tau_2 = R_2 (C_1 + C_2)$, $\tau_3 = R_2 C_2$.

Fig. 2—Loop filter topologies: (a) No loop filter, (b) RC loop filter, (c) lag-lead loop
filter, (d) second-order active filter, and (e) third-order active filter.
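
The five transfer functions in this list can be collected in one helper for numerical work. This is a minimal Python sketch; the function name `loop_filter`, the string labels for the filter types, and the convention that the single RC time constant is passed as `tau1` are illustrative choices, not PLACE's own interface.

```python
def loop_filter(kind, s, tau1=None, tau2=None, tau3=None):
    """Evaluate the loop filter transfer function F(s) at complex frequency s."""
    if kind == "none":
        return 1.0 + 0.0j
    if kind == "rc":                      # tau1 plays the role of tau = RC
        return 1.0 / (1.0 + tau1 * s)
    if kind == "lag-lead":
        return (1.0 + tau2 * s) / (1.0 + tau1 * s)
    if kind == "active2":                 # second-order active filter
        return (1.0 + tau2 * s) / (tau1 * s)
    if kind == "active3":                 # third-order active filter
        return (1.0 + tau2 * s) / (tau1 * s * (1.0 + tau3 * s))
    raise ValueError(f"unknown loop filter type: {kind}")

# The passive filters have unity dc gain, F(0) = 1.
print(loop_filter("lag-lead", 0j, tau1=57.4513e-3, tau2=4.00336e-3))
```

Evaluating at s = j*omega gives the magnitude and phase needed for the Bode-style calculations of Section IV; the active filters have a pole at the origin and must not be evaluated at s = 0.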


We now list several guidelines for selecting a loop filter. The choice
between an active or passive loop filter must be made first. If the
designer chooses a passive loop filter, the lag-lead filter is preferred over the
RC loop filter for the following reasons:

1. For the lag-lead loop filter, the designer can specify the loop
natural frequency $\omega_n$ and damping $\zeta$ independently of each other, and
thus it is possible to have a narrowband loop with substantial
damping. However, for the RC loop filter, a narrowband PLL (low $\omega_n$)
requires a high value of $\tau$ (loop filter time constant), but a high $\tau$
results in low damping and possible instability (see Section IV).

2. The VCO parasitic pole is less likely to render a lag-lead PLL 
unstable compared with the RC loop filter PLL. (This is because the 
open loop phase for the RC loop filter PLL approaches —180 degrees 
at high frequencies, whereas the phase approaches —90 degrees for the 
lag-lead case.) 

3. The lag-lead PLL exhibits improved transient behavior. For a 
given PLL damping value, the phase error transient of the lag-lead 
PLL reaches its steady-state value much sooner than that of the RC 
loop filter. 

4. All of these improvements are obtained for the cost of one
resistor. 
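
The first reason can be made concrete with a few numbers. The Python sketch below uses relations obtained by substituting each F(s) into the closed loop gain of Section 4.1.1 and matching the denominator to the standard second-order form; Appendix C is not reproduced here, so treat these expressions as a derivation under that assumption. K is the lumped gain constant of Section 4.1, and the numerical values are hypothetical.

```python
import math

def rc_loop(K, tau):
    """Natural frequency (rad/s) and damping for the RC loop filter."""
    omega_n = math.sqrt(K / tau)
    zeta = 1.0 / (2.0 * math.sqrt(K * tau))
    return omega_n, zeta

def lag_lead_loop(K, tau1, tau2):
    """Natural frequency (rad/s) and damping for the lag-lead loop filter."""
    omega_n = math.sqrt(K / tau1)
    zeta = 0.5 * omega_n * (tau2 + 1.0 / K)
    return omega_n, zeta

K = 10.0                                   # hypothetical lumped gain, 1/s
print(rc_loop(K, tau=0.1))                 # wide loop
print(rc_loop(K, tau=10.0))                # narrow loop, damping collapses
print(lag_lead_loop(K, tau1=10.0, tau2=1.3))  # same narrow loop, damping restored
```

With the RC filter, narrowing the loop by raising tau lowers the damping at the same time. With the lag-lead filter, tau1 sets the natural frequency and tau2 is then free to set the damping.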

An active loop filter is required when a higher loop gain is required 
(to increase the hold range, for example). The advantages of the third- 
order active filter over the second-order active filter are improved 
response to phase-step changes, zero steady-state phase error due to 
frequency ramp inputs, and better reduction of VCO noise. 

After selecting a loop filter type, the user has two options: the 
desired damping and natural frequency may be specified, or the loop 
filter time constants may be specified. The first option is usually used 
when designing a new PLL, while the second option is normally used 
when analyzing an existing PLL. 


2.2 PLACE 2.0 output 


The user can request a stability analysis, a loop filter analysis, a 
tracking analysis, a jitter analysis, a sensitivity analysis, and an 
interactive optimization routine. 

The stability analysis yields (1) second-order undamped natural
frequency $\omega_n$ in hertz, (2) phase margin in degrees, (3) phase margin
degradation due to the VCO (and/or op amp) pole, (4) phase margin 
degradation due to any phase delay of the feedback counter, (5) 
damping, and (6) PLL system bandwidth in hertz. 

The loop filter analysis yields (1) the loop filter time constants in 
seconds, (2) values for the loop filter resistors and capacitors, and (3) 
the loop filter 3-dB frequency. 

The tracking analysis yields (1) hold range in hertz (for active loop
filters, the hold range is normalized, i.e., assuming F(0) = 1); (2)
approximate capture range in hertz; (3) approximate pull-in range in
hertz; and (4) approximate lockup time in seconds.

The jitter analysis yields (1) output jitter due to incoming reference 
signal jitter: jitter bandwidth in hertz, frequency of jitter peaking in 
hertz, and noise bandwidth in hertz; (2) output jitter due to reference 
leakage through the phase detector: frequency of first sideband in 
hertz, magnitude of first sideband relative to the carrier in decibels, 
peak phase jitter in degrees, peak frequency deviation in hertz, and 
loop filter’s attenuation of reference frequency in decibels; and (3) 
output jitter due to VCO phase noise: VCO phase noise reduction 3- 
dB frequency. 

The sensitivity analysis computes upper and lower bounds on the 
hold range, capture range, pull-in range, and lockup time. The user 
input is the tolerance of the gain constants and loop filter components. 

The optimization routine asks if the user wants to design for any 
one of the following: (1) larger hold range, (2) larger capture and pull- 
in range, (3) faster lockup time, or (4) less output jitter. 

PLACE then automatically changes the PLL parameters to achieve 
the desired goal; then, the user can rerun the loop filter analysis to 
observe the new component values and use the tracking analysis to 
observe the new lockup time, etc. 

The graphics portion of PLACE plots (on a TEK 4014 or similar 
device) the open and closed loop gains and phase, and the PLL 
response to VCO phase noise. For the graphics portion, the S package® 
must be installed on the system. 


III. EXAMPLE


This example demonstrates the lag-lead loop filter case; it also
demonstrates that the parasitic VCO pole may add sufficient phase
shift to cause possible instability.

The user input to PLACE is as follows: The PLL for this example 
phase locks a 3.088-MHz crystal oscillator to an incoming 1.544 MHz 
signal. The VCXO has a measured gain constant of 800 Hz/V around 
the center frequency of 3.088 MHz; it also has a parasitic pole at 10 
Hz. The phase detector is a Complementary Metal-Oxide Semicon- 
ductor (CMOS) exclusive-or gate measured at 1.4 V/rad; this was 
derived by measuring the logic high/low levels: $(4.6 - 0.2)/\pi = 1.4$.
The phase comparison is done at 4 kHz; thus, the values for
the frequency dividers are $N_{FF}$ = 386 = 1.544 MHz/4 kHz and $N_{FB}$ =
772 = 3.088 MHz/4 kHz. The loop filter resistors and capacitors were
measured accurately, yielding time constants of $\tau_1$ = 57.4513 ms and
$\tau_2$ = 4.00336 ms.
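
The arithmetic behind these inputs can be reproduced in a few lines. This Python sketch only restates the numbers given in this section; the variable names are illustrative, and the lumped gain K uses the definition given in Section 4.1 (phase detector gain times VCO gain in rad/s/V, divided by the feedback divisor).

```python
import math

f_in = 1.544e6     # incoming signal, Hz
f_out = 3.088e6    # VCXO center frequency, Hz
f_ref = 4.0e3      # phase comparison frequency, Hz

N_FF = round(f_in / f_ref)     # feed-forward divisor
N_FB = round(f_out / f_ref)    # feedback divisor

K_p = (4.6 - 0.2) / math.pi    # exclusive-or gate gain, V/rad
K_V = 800.0                    # measured VCXO gain, Hz/V
K_Vr = 2.0 * math.pi * K_V     # VCXO gain, rad/s/V
K = K_p * K_Vr / N_FB          # lumped gain constant, 1/s

print(N_FF, N_FB, round(K_p, 3), round(K, 3))
```

The dividers come out as 386 and 772, and the phase detector gain rounds to the quoted 1.4 V/rad.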

The results from the stability, loop filter, tracking, and jitter analysis
appear in Appendix B. Table I summarizes the calculated results and
also shows the measured values. The stability analysis indicates a
damping of 0.7 and natural frequency of 2 Hz. The phase margin is 60
degrees: the VCO pole reduced the phase margin by 7.4 degrees, while
the feedback counter caused negligible phase margin degradation of
0.02 degree. The PLL bandwidth is 1.2 Hz; this low PLL bandwidth
is due to the very low $K_V$ of 800 Hz/V. The previous results can also
be found from the PLACE graphic display of open loop gain, shown
in Fig. 3. Unity gain occurs at 1.2 Hz, where the phase is -120 degrees.
Note that the parasitic VCO pole has brought the total phase shift to
-180 degrees (instead of -90 degrees) for high frequencies. The tracking
analysis indicates a hold range of +/-568 Hz. The measured value is
+496 Hz, -588 Hz. The hold range is asymmetrical due to the larger
than 50-percent duty cycle phase detector output during the lock
condition. The pull-in range is +493, -582 Hz; it is common for low
gain loops to have pull-in ranges close to the hold range.

Table I—Comparing PLACE output with measured results

Parameter                          Calculated       Measured
Hold range                         +/-568 Hz        +496, -588 Hz
Pull-in range                      +/-568 Hz        +493, -585 Hz
Lockup time                        324 ms           400 ms
Frequency of jitter peaking        1.6 Hz           1.5 Hz
Magnitude of jitter peaking        1.3 dB           3 dB
Jitter bandwidth                   3 Hz             4 Hz
Frequency of first sideband        8 kHz            8 kHz
Magnitude of first sideband        -52 dBc          -50 dBc
Peak phase jitter                  0.279 degrees    0.343 degrees
Peak frequency deviation           20 Hz            25 Hz
Reference frequency attenuation    23 dB            23 dB

Fig. 3—PLACE graphic display of open loop gain for the lag-lead loop filter example.
The magnitude is indicated by the solid line, and the phase is indicated by the dotted
line (right scale).

The calculated lockup time is 324 ms, while the measured time is 
400 ms; this may be found from Fig. 4, a display of tuning voltage 
during the lock-in process. 

The jitter analysis indicates 1.3 dB of jitter peaking at 1.6 Hz, and 
a jitter bandwidth of 3 Hz. The closed loop gain Bode plot is shown in 
Fig. 5. The peaking is difficult to observe given the scale of the plot. 
The measured values may be found from Fig. 6, a spectrum analyzer 
display of closed loop gain versus jitter frequency from an HP 8557A. 
There is 3 dB of jitter peaking at 1.5 Hz (picture only displays up to 
50 Hz). (The measured jitter peaking is higher than calculated due to 
additional poles in the VCXO.) The gain is down 3 dB (jitter band- 
width) at 4 Hz. The test equipment configuration for measuring the 
closed loop gain transfer function of Fig. 6 is shown in Fig. 7. 

The calculated magnitude of the 8-kHz first sideband off the PLL 
output carrier is —52 dB; the measured level is —50 dB. This is shown 
in Fig. 8, a spectrum display from an HP 8557A. The peak phase jitter 
is 0.279 degree; the measured value from an HP 8901A modulation 
analyzer is 0.348 degree. 


Fig. 4—Tuning voltage versus time during acquisition measured on a Paratronics
5000 logic analyzer. (Vertical axis: tuning voltage; horizontal axis: time in
milliseconds.)

Fig. 5—PLACE graphic display of closed loop gain. The magnitude is indicated by
the solid line, and the phase is indicated by the dotted line (right scale).





Fig. 6—Measured closed loop gain versus frequency showing 3 dB of jitter peaking. 
Vertical scale is 10 dB/division, horizontal scale is 5 Hz/division, and resolution 
bandwidth is 1 Hz. 


The sensitivity analysis is also found in Appendix B. The upper and 
lower bounds found there reflect a 5-percent tolerance on the entered 
VCO gain constant, and 2-percent tolerance on the loop filter resistors 
and capacitors. 

The remainder of this paper explains the calculations that PLACE 
performs. 


IV. ANALYSIS 
4.1 Stability analysis 
The stability analysis requires computation of the PLL transfer
function. The components of a PLL are shown in Fig. 1; constants are
defined in Section 2.1.

Fig. 7—Test equipment configuration used to obtain the closed loop gain transfer
function shown in Fig. 6.

Fig. 8—Spectrum of VCO output carrier showing 8-kHz sidebands down 50 dB.
Vertical scale is 10 dB/division, horizontal scale is 5 kHz/division, and resolution
bandwidth is 1 kHz.

For the following calculations, we define $K_{Vr} = 2\pi K_V$ = VCO gain
constant in rad/s/V. We also define the lumped gain constant:


$$K = \frac{K_p K_{Vr}}{N_{FB}}.$$

The equivalent frequency domain model of the PLL is also shown in
Fig. 1, where F(s) and $V_f(s)$ are the Laplace transforms of the loop
filter transfer function and loop filter output voltage, respectively. The
VCO model of $K_{Vr}/s$ is derived as follows:


$$\omega_{out} = \frac{d\phi_{out}}{dt} = K_{Vr} v_f$$

$$\mathcal{L}\left\{\frac{d\phi_{out}}{dt}\right\} = s\,\phi_{out}(s) = K_{Vr} V_f(s)$$

$$\phi_{out}(s) = \frac{K_{Vr} V_f(s)}{s}.$$
4.1.1 Loop gains


The transfer function for the PLL may be easily derived if we define
the feed-forward gain as


$$G(s) = \frac{\phi_{out}(s)}{\phi_e(s)} = \frac{K_p K_{Vr} F(s)}{s} \tag{1}$$

and the feedback gain as

$$\beta(s) = \frac{1}{N_{FB}}. \tag{2}$$


Then, following conventional control theory analysis, the open loop
gain is the product of the feed-forward and feedback gains:


$$\beta(s)G(s) = \frac{K_p K_{Vr} F(s)}{s N_{FB}} = \frac{K F(s)}{s}. \tag{3}$$

The closed loop gain is

$$H(s) = \frac{\phi_{out}(s)}{\phi_{in}(s)} = \frac{G(s)}{1 + \beta(s)G(s)}
= \frac{K_p K_{Vr} F(s)/s}{1 + K_p K_{Vr} F(s)/(s N_{FB})}
= \frac{K_p K_{Vr} F(s)}{s + K_p K_{Vr} F(s)/N_{FB}}
= \frac{N_{FB} K F(s)}{s + K F(s)}. \tag{4}$$


The error transfer function $E(s) = \phi_e(s)/\phi_{in}(s)$ is easily found as


$$E(s) = \frac{\phi_e(s)}{\phi_{in}(s)} = \frac{1}{1 + \beta(s)G(s)}. \tag{5}$$


Note that if there is no feedback counter, $\beta(s) = 1$, and then $E(s) =
1 - H(s)$.

The loop type is specified by the number of poles at the origin of
$\beta(s)G(s)$. The loop order is specified by the total number of poles in
$\beta(s)G(s)$. We have ignored the phase delay through the feedback
counter in the above analysis; this phase delay is treated in Section
4.1.5.
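
The three gains defined in this subsection are easy to evaluate numerically at s = j*omega. The following Python sketch is one way to do so, assuming the loop filter is supplied as a callable `F(s)`; the function names and the numerical values are illustrative, and the code is not taken from PLACE.

```python
def open_loop(s, K, F):
    """Open loop gain beta(s)G(s) = K F(s) / s, with K the lumped gain constant."""
    return K * F(s) / s

def closed_loop(s, K, N_FB, F):
    """Closed loop gain H(s) = N_FB K F(s) / (s + K F(s))."""
    return N_FB * K * F(s) / (s + K * F(s))

def error_tf(s, K, F):
    """Error transfer function E(s) = 1 / (1 + beta(s)G(s))."""
    return 1.0 / (1.0 + open_loop(s, K, F))

# Hypothetical lag-lead filter and gain, evaluated at omega = 2 rad/s.
def lag_lead(s):
    return (1.0 + 0.004 * s) / (1.0 + 0.057 * s)

s = 2.0j
print(abs(open_loop(s, 9.0, lag_lead)))
print(error_tf(s, 9.0, lag_lead) + closed_loop(s, 9.0, 1, lag_lead))  # 1 when N_FB = 1
```

The last line checks the remark that E(s) = 1 - H(s) when there is no feedback counter.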





4.1.2 Damping, second-order undamped natural frequency $\omega_n$


From the above equations we see that the PLL response is highly
dependent on the form of loop filter F(s). By substituting a particular
loop filter function F(s) into eq. (4), we can (except for the no loop
filter case) get the denominator into the classic second-order control
system form of $s^2 + 2\zeta\omega_n s + \omega_n^2$, where $\omega_n$ is the undamped natural
frequency and $\zeta$ is the damping. This is shown in Appendix C.
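
As a worked instance of this substitution, take the lag-lead filter. The steps below are derived here directly from the closed loop gain of Section 4.1.1 with K the lumped gain constant; Appendix C is not reproduced in this section, so its presentation may differ.

```latex
\begin{align*}
H(s) &= \frac{N_{FB} K F(s)}{s + K F(s)}, \qquad
        F(s) = \frac{1 + \tau_2 s}{1 + \tau_1 s} \\
     &= \frac{N_{FB} K (1 + \tau_2 s)}{\tau_1 s^2 + (1 + K\tau_2)\, s + K}
      = \frac{N_{FB} (K/\tau_1)(1 + \tau_2 s)}
             {s^2 + \dfrac{1 + K\tau_2}{\tau_1}\, s + \dfrac{K}{\tau_1}}, \\
\omega_n^2 &= \frac{K}{\tau_1}, \qquad
2\zeta\omega_n = \frac{1 + K\tau_2}{\tau_1}
\;\Longrightarrow\;
\zeta = \frac{\omega_n}{2}\left(\tau_2 + \frac{1}{K}\right).
\end{align*}
```

The last relation rearranges to the expression for the second time constant used in Section 4.2.3. With the Section III values (K of about 9.12 per second, time constants of 57.4513 ms and 4.00336 ms) it gives a natural frequency of about 12.6 rad/s, or 2 Hz, and a damping of about 0.72, in line with the reported values of 2 Hz and 0.7.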
4.1.3 Phase margin


PLACE calculates the phase of $\beta(s)G(s)$ at unity open loop gain;
the difference from -180 degrees is the phase margin. This definition
is identical to that used in the stability analysis of feedback amplifiers.


4.1.4 Phase margin degradation due to the VCO (and/or op amp) pole 


A parasitic pole causes the open loop gain to fall off more quickly
and thus affects the phase margin. The tuning element of a VCO
(typically a varactor) cannot respond to rapidly changing tuning
voltages. We represent this by assigning a pole to the VCO. It has
been the author's experience that this VCO pole may be the dominant
pole of the PLL (especially if VCXOs are used). PLACE determines
the phase margin degradation due to the parasitic VCO pole by
multiplying the open loop gain by $1/(1 + \tau_v s)$, where $\tau_v = 1/(2\pi f_V)$
and $f_V$ is the VCO pole frequency. The VCO pole may be found by impressing
an ac modulating signal on the dc voltage to the varactor, and determining
the highest modulating frequency for which the VCO output
will follow the input. (A typical value for a VCXO is $f_V$ = 10 Hz.)
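
The effect of the extra pole can be reproduced for the lag-lead example of Section III. The Python sketch below is an illustration only: it finds the unity-gain frequency by bisection rather than by whatever method PLACE uses internally, and the function names are hypothetical. The input values are those quoted in Section III.

```python
import cmath
import math

def open_loop_gain(omega, K, tau1, tau2, f_v=None):
    """Lag-lead open loop gain K F(s)/s at s = j*omega, optionally with a VCO pole."""
    s = 1j * omega
    g = K * (1.0 + tau2 * s) / (s * (1.0 + tau1 * s))
    if f_v is not None:
        g /= 1.0 + s / (2.0 * math.pi * f_v)   # factor 1/(1 + tau_v s)
    return g

def unity_gain_omega(K, tau1, tau2, f_v=None, lo=1e-3, hi=1e6):
    """Bisection on a log scale; the gain magnitude falls monotonically here."""
    for _ in range(200):
        mid = math.sqrt(lo * hi)
        if abs(open_loop_gain(mid, K, tau1, tau2, f_v)) > 1.0:
            lo = mid
        else:
            hi = mid
    return math.sqrt(lo * hi)

def phase_margin_deg(K, tau1, tau2, f_v=None):
    w = unity_gain_omega(K, tau1, tau2, f_v)
    phase = math.degrees(cmath.phase(open_loop_gain(w, K, tau1, tau2, f_v)))
    return 180.0 + phase

K = ((4.6 - 0.2) / math.pi) * (2.0 * math.pi * 800.0) / 772   # Section III inputs
pm_ideal = phase_margin_deg(K, 57.4513e-3, 4.00336e-3)
pm_pole = phase_margin_deg(K, 57.4513e-3, 4.00336e-3, f_v=10.0)
print(pm_ideal, pm_pole, pm_ideal - pm_pole)
```

Under these assumptions the margin comes out at roughly 66 degrees without the pole and roughly 59 degrees with the 10-Hz pole, consistent with the 60-degree margin and the degradation of about 7 degrees reported in Section III.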

If the user wants to account for the operational amplifier pole for
the active loop filter case, he may lump this pole into the VCO parasitic
pole. The designer should try to keep the operational amplifier bandwidth
much larger than the PLL system bandwidth (defined in Section
4.1.6).


4.1.5 Phase margin degradation due to the feedback counter 


The effect of the feedback counter is to add a phase delay of $f/f_{ref}$ to
the open loop gain and thus degrade the phase margin by $f_u/f_{ref}$ rad,
where $f_u$ is the frequency for unity open loop gain and $f_{ref}$ is the
frequency entering the phase detector.*” This is derived as follows.

Since any change in the VCO frequency can be observed by the
phase detector only after the feedback counter overflows T seconds
later (worst case), the effect of the counter is to produce a maximum
delay of up to T seconds, where the delay time T is given by


$$T = N_{FB}\,(1/f_{VCO}) = N_{FB}\,\frac{1}{N_{FB} f_{ref}} = 1/f_{ref}.$$

Thus, in the frequency domain we represent the feedback counter by 
a ea) 
Nrp Nes 


PLACE calculates this phase shift and notifies the user of the resulting 
phase margin degradation. 
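
The size of this degradation is easy to estimate from the expression stated at the start of this subsection. The Python sketch below simply converts the stated quantity, the unity-gain frequency divided by the reference frequency in radians, to degrees; the function name is illustrative.

```python
import math

def counter_phase_degradation_deg(f_u, f_ref):
    """Phase margin lost to the feedback counter: f_u / f_ref rad, in degrees."""
    return math.degrees(f_u / f_ref)

# Section III example: unity gain near 1.2 Hz, phase comparison at 4 kHz.
print(counter_phase_degradation_deg(1.2, 4000.0))
```

For the Section III example this is a little under 0.02 degree, which matches the negligible degradation reported there.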
4.1.6 PLL system bandwidth in hertz


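The PLL system bandwidth is the frequency for PLL unity open
loop gain; it is found by solving $|\beta(j\omega)G(j\omega)| = 1$ for $\omega$. We have the
following:


No loop filter:

$$F(s) = 1,$$

and

$$\mathrm{PLL}_{BW} = \frac{1}{2\pi}\,[K]\ \text{Hz}.$$

RC loop filter:

$$F(s) = \frac{1}{1 + \tau s},$$

and

$$\mathrm{PLL}_{BW} = \frac{1}{2\pi}\,\frac{1}{\sqrt{2}\,\tau}\left\{\sqrt{1 + 4(\tau K)^2} - 1\right\}^{1/2}
= \frac{1}{2\pi}\,\sqrt{2}\,\zeta\omega_n\left\{\sqrt{1 + 1/(4\zeta^4)} - 1\right\}^{1/2}\ \text{Hz},$$

with $\omega_n$ in rad/s.

Lag-lead loop filter:

$$F(s) = \frac{1 + \tau_2 s}{1 + \tau_1 s},$$

and

$$\mathrm{PLL}_{BW} = \frac{1}{2\pi}\,\frac{1}{\sqrt{2}\,\tau_1}\left\{\sqrt{a^2 + 4(\tau_1 K)^2} + a\right\}^{1/2}\ \text{Hz},$$

where $a = (K\tau_2)^2 - 1$.

Second-order active loop filter:

$$F(s) = (1 + \tau_2 s)/(\tau_1 s),$$

and

$$\mathrm{PLL}_{BW} = \frac{\omega_n}{2\pi}\left\{2\zeta^2 + \sqrt{4\zeta^4 + 1}\right\}^{1/2}\ \text{Hz}.$$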


For the third-order active loop filter, the user may specify the desired
PLL bandwidth, or, alternatively, the value defaults to $f_{ref}/50$. PLACE
uses this value to optimize the loop for best phase noise performance
(see Appendix C).
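
The closed-form bandwidths for the first four filter types can be sketched as follows. This Python version is written from the unity open loop gain condition of this subsection and is not taken from PLACE; the function name and the type labels are illustrative, and for the second-order active filter the relations between the natural frequency, the damping, and the time constants are derived here from the closed loop gain, an assumption since Appendix C is not reproduced.

```python
import math

def pll_bandwidth_hz(kind, K, tau1=None, tau2=None):
    """Unity open loop gain frequency in Hz; K is the lumped gain constant."""
    if kind == "none":
        w = K
    elif kind == "rc":                     # tau1 plays the role of tau = RC
        w = math.sqrt(math.sqrt(1.0 + 4.0 * (tau1 * K) ** 2) - 1.0) / (math.sqrt(2.0) * tau1)
    elif kind == "lag-lead":
        a = (K * tau2) ** 2 - 1.0
        w = math.sqrt(math.sqrt(a * a + 4.0 * (tau1 * K) ** 2) + a) / (math.sqrt(2.0) * tau1)
    elif kind == "active2":
        omega_n = math.sqrt(K / tau1)      # assumed: omega_n^2 = K / tau1
        zeta = 0.5 * omega_n * tau2        # assumed: zeta = omega_n * tau2 / 2
        w = omega_n * math.sqrt(2.0 * zeta ** 2 + math.sqrt(4.0 * zeta ** 4 + 1.0))
    else:
        raise ValueError(f"no closed form implemented for: {kind}")
    return w / (2.0 * math.pi)

K = 7040.0 / 772.0   # lumped gain for the Section III inputs, 1/s
print(pll_bandwidth_hz("lag-lead", K, 57.4513e-3, 4.00336e-3))
```

For the Section III values the lag-lead expression gives about 1.3 Hz before any parasitic pole is included; Section III reports a PLL bandwidth of 1.2 Hz.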




4.2 Loop filter analysis 
4.2.1 Loop filter time constants 


The loop filter time constants are calculated from the undamped 
natural frequency and the damping by using the equations derived in 
Appendix C. Alternatively, the user may directly specify the loop filter 
time constants. 


4.2.2 Loop filter resistors and capacitors 


PLACE assumes 0.1-µF capacitors and calculates the resistors from
the definition of the time constants. Note that the user can linearly
scale the resistors and capacitors to any desired value (e.g., if the user
desires 0.01-µF capacitors, multiply the calculated resistor values by
10).
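
For the lag-lead filter this calculation follows directly from the time-constant definitions of Section 2.1. The Python sketch below is an illustration with a hypothetical function name; it uses the 0.1-µF default stated above and the time constants of the Section III example.

```python
def lag_lead_components(tau1, tau2, C=0.1e-6):
    """Resistor values (ohms) from tau1 = (R1 + R2) C and tau2 = R2 C."""
    R2 = tau2 / C
    R1 = (tau1 - tau2) / C
    return R1, R2

R1, R2 = lag_lead_components(57.4513e-3, 4.00336e-3)
print(R1, R2)

# Scaling the capacitor down by 10 scales both resistors up by 10.
print(lag_lead_components(57.4513e-3, 4.00336e-3, C=0.01e-6))
```

With the default capacitor the example time constants correspond to roughly 534 kilohms and 40 kilohms.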


4.2.3 Constraints on lag-lead loop filter 


To maintain real values of resistors and capacitors, certain constraints
must be met for the lag-lead loop filter case. PLACE first
asks for the desired value of $\omega_n$. Then, when it asks for the desired
value of damping, the entered value for damping must satisfy two
conditions:


1. Since $\tau_2 = 2\zeta/\omega_n - 1/K$, to ensure a positive $\tau_2$ we must ensure

$$\zeta > \frac{\omega_n}{2K}.$$

2. Since $R_1 = (\tau_1 - \tau_2)/C$, to ensure a positive $R_1$ we must ensure

$$\tau_1 > \tau_2,$$

$$\frac{K}{\omega_n^2} > \frac{2\zeta}{\omega_n} - \frac{1}{K},$$

$$\zeta < \frac{K^2 + \omega_n^2}{2\omega_n K}.$$

Hence, after the user specifies the desired value for $\omega_n$, PLACE asks
the user to enter a value for damping that satisfies

$$\frac{\omega_n}{2K} < \zeta < \frac{K^2 + \omega_n^2}{2\omega_n K}.$$

4.2.4 Loop filter 3-dB frequency or zero frequency 


The frequency at which F(s) is down 3 dB is found by solving
$|F(j\omega)| = 0.707$ for $\omega$. We have the following (iff $\tau_1 > \sqrt{2}\,\tau_2$, else F(s)
is never down 3 dB):

RC loop filter:

$$F(s) = \frac{1}{1 + \tau s},$$

and

$$F_{BW} = \frac{1}{2\pi}\left[\frac{1}{\tau}\right]\ \text{Hz}.$$

Lag-lead loop filter:

$$F(s) = \frac{1 + \tau_2 s}{1 + \tau_1 s},$$

and

$$F_{BW} = \frac{1}{2\pi}\left[\frac{1}{\tau_1^2 - 2\tau_2^2}\right]^{1/2}\ \text{Hz}.$$

For the second- and third-order active loop filter cases, $F_{BW}$ is undefined
because F(s) has a pole at the origin. For these cases PLACE calculates
the zero frequency.
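
The two passive cases can be sketched in a few lines of Python. The function name and the convention of returning `None` when the lag-lead filter never falls 3 dB are illustrative choices, not PLACE behavior.

```python
import math

def loop_filter_3db_hz(kind, tau1, tau2=0.0):
    """3-dB frequency of a passive loop filter in Hz, or None if never reached."""
    if kind == "rc":                      # tau1 plays the role of tau = RC
        return 1.0 / (2.0 * math.pi * tau1)
    if kind == "lag-lead":
        d = tau1 ** 2 - 2.0 * tau2 ** 2
        if d <= 0.0:                      # requires tau1 > sqrt(2) * tau2
            return None
        return 1.0 / (2.0 * math.pi * math.sqrt(d))
    raise ValueError("3-dB frequency is undefined for active loop filters")

print(loop_filter_3db_hz("lag-lead", 57.4513e-3, 4.00336e-3))
```

For the Section III time constants the lag-lead filter is down 3 dB at a little under 3 Hz.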


4.3 Tracking analysis 
4.3.1 Hold range 


The hold range is defined as the maximum input frequency range 
that the PLL will track once it is in lock. It may be found experimen- 
tally by slowly varying the input frequency to a PLL that is in lock 
and noting when lock is lost. (The word “slowly” is emphasized because
if the input frequency has a step change, the transient behavior of the
loop may cause the PLL to lose lock, even though the step is within
the hold range.)

The hold range is independent of the type of loop filter F(s) and is
given by numerous authors (e.g., see Ref. 6) as $K_p K_{Vr} F(0)/N_{FB}$, where
F(0) is the dc gain of the loop filter. (Note that the error voltage is a
constant dc value when the loop is in lock.) For passive loop filters
F(0) = 1, and accounting for the feed-forward counter, PLACE computes
the hold range as


$$f_{hold} = \pm\frac{1}{2\pi}\, N_{FF}\,\frac{K_p K_{Vr}}{N_{FB}}\ \text{Hz}. \tag{6}$$

For active loop filters F(0), and hence the hold range, can be made
arbitrarily large (until the VCO or operational amplifier saturates);
thus PLACE computes the normalized hold range (it assumes
F(0) = 1).
When recovering timing from Pulse Code Modulated (PCM) data
lines, the hold (and capture) range may be asymmetrical due to the
density of ones.”

Nonsinusoidal phase detectors theoretically extend the hold and
capture ranges. For example, triangular phase detectors have a so-called
“extension factor” of $\pi/2$, while sawtooth phase detectors have
a factor of $\pi$ (see Ref. 6). PLACE leaves it to the user to modify the
phase detector gain constant $K_p$ (if the designer is so inclined) when
PLACE requests the value.


4.3.2 Capture and pull-in range 


The capture range is defined as the maximum input frequency range 
for which a PLL will acquire phase lock without skipping cycles (i.e., 
the phase detector voltage monotonically drives the VCO toward lock, 
without any beating). (Some authors call this the lock-in range.) The 
pull-in range is defined as the maximum input frequency range for 
which a PLL will acquire lock (typically with skipping cycles), even if 
it takes minutes or hours to lock up. (In this case the phase detector
output voltage may be a beat note swinging the VCO above and below
$f_{in}$.)

Whereas eq. (6) gives an accurate value for the hold range, the 
determination of the capture and pull-in range is extremely difficult 
because the loop starts (by definition) out of lock, and thus a nonlinear 
analysis is required. The solution cannot be expressed in closed form 
(except for the F(s) = 1 case), and only a graphical phase-plane 
trajectory procedure®® will yield true results. PLACE approximates 
the capture and pull-in ranges by using the approximating equations 
found in the literature and modifying them to account for the feedback 
and feed-forward counters: 


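No loop filter:

$$f_{capture} = f_{hold} = f_{pull} = \pm\frac{1}{2\pi}\,N_{FF}\,\frac{K_\phi K_{Vr}}{N_{FB}} \text{ Hz}$$

(see Refs. 8 and 9).

RC loop filter: If $K\tau_1 < 0.25$,

$$f_{capture} = f_{hold} = f_{pull} = \pm\frac{1}{2\pi}\,N_{FF}\,\frac{K_\phi K_{Vr}}{N_{FB}} \text{ Hz}$$

(see Ref. 10).

Else,

$$f_{capture} = \pm\frac{1}{2\pi}\,N_{FF}\,2\zeta\omega_n \text{ Hz},$$

and

$$f_{pull} = \pm\frac{1}{2\pi}\,N_{FF}\,1.25\,\omega_n \text{ Hz}$$

(see Ref. 8).

Lag-lead loop filter:

$$f_{capture} = \pm\frac{1}{2\pi}\,N_{FF}\,K\,\frac{\tau_2}{\tau_1} \approx \pm\frac{1}{2\pi}\,N_{FF}\,2\zeta\omega_n \text{ Hz}$$

(see Refs. 8 and 10), and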


K 1/2 
foull = = Nre24 Kt zs x| Hz 
v 2 


T1 
(see Refs. 10 and 11). 


Second-order active filter:

$$f_{capture} = \pm\frac{1}{2\pi}\,N_{FF}\,2\zeta\omega_n = \pm\frac{1}{2\pi}\,N_{FF}\,\tau_2\omega_n^2 \text{ Hz}$$


(see Ref. 8). 


For the second- and third-order active loop filter cases, the hold and 
pull-in ranges will be as large as the maximum frequency range of the 
VCO, assuming the operational amplifier output does not saturate at 
the supply rail, and assuming negligible loop delay. 
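As an illustration of how the feed-forward counter scales the tracking ranges, the sketch below assumes the hold range is $N_{FF}K/2\pi$ and the lag-lead capture range is approximately $N_{FF}K(\tau_2/\tau_1)/2\pi$, with $K = K_\phi K_{Vr}/N_{FB}$. The function names and the numerical inputs are hypothetical and are not taken from PLACE.

```python
import math

def lumped_gain(k_phi, k_vr, n_fb):
    """Lumped gain constant K = K_phi * K_Vr / N_FB, in rad/s."""
    return k_phi * k_vr / n_fb

def hold_range_hz(k, n_ff=1):
    """One-sided hold range (+/- Hz), assuming F(0) = 1."""
    return n_ff * k / (2.0 * math.pi)

def capture_range_lag_lead_hz(k, tau1, tau2, n_ff=1):
    """Approximate one-sided capture range (+/- Hz) for a lag-lead filter."""
    return n_ff * k * (tau2 / tau1) / (2.0 * math.pi)

# Hypothetical component values, chosen only to exercise the functions.
K = lumped_gain(k_phi=0.5, k_vr=2000.0, n_fb=100)  # 10 rad/s
print(hold_range_hz(K, n_ff=4))
print(capture_range_lag_lead_hz(K, tau1=0.05, tau2=0.005, n_ff=4))
```

Under these assumptions the capture range is always the hold range scaled by $\tau_2/\tau_1$, which is why a lag-lead loop with widely separated time constants captures over a much narrower range than it holds.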


4.3.3 Lockup time 


The lockup time is defined for PLACE as the time it takes the PLL
to acquire phase and frequency lock from an initial frequency offset
equal to the pull-in range. For the no loop filter case and low-gain
($K\tau_1 < 0.25$) passive loop filter case, PLACE calculates the lockup
time as $T_{lock} = 1/K$ seconds (see Ref. 8).

For second-order active loop filters and high-gain ($K\tau_1 > 0.25$)
passive loop filters, the lockup time is approximately given by
$T_{lock} = \Delta\omega^2/2\zeta\omega_n^3$ (see Refs. 6 and 8). PLACE lets
$\Delta\omega = 2\pi([f_{capture} + f_{pull}]/2N_{FF})$, which is where the approximation is most accurate; for the
second-order loop filter, PLACE lets $\Delta\omega = 2\pi(2f_{capture})$. PLACE does
not calculate the lockup time for the third-order loop filter.
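The passive-filter branch of this rule can be sketched as follows. This is an assumption-laden illustration rather than the PLACE routine: the function name and argument list are hypothetical, and the second-order active case (which uses $\Delta\omega = 2\pi(2f_{capture})$) is deliberately left out.

```python
import math

def lockup_time_passive_s(k, tau1, zeta, omega_n, f_capture_hz, f_pull_hz, n_ff=1):
    """Approximate lockup time in seconds for a passive loop filter.

    Low gain (K*tau1 < 0.25):  T_lock = 1/K.
    High gain:                 T_lock = dw**2 / (2*zeta*omega_n**3), with
                               dw = 2*pi*(f_capture + f_pull)/(2*N_FF).
    """
    if k * tau1 < 0.25:
        return 1.0 / k
    delta_omega = 2.0 * math.pi * (f_capture_hz + f_pull_hz) / (2.0 * n_ff)
    return delta_omega**2 / (2.0 * zeta * omega_n**3)

# Hypothetical values: the first call takes the low-gain branch,
# the second the high-gain branch.
print(lockup_time_passive_s(2.0, 0.1, 1.2, 4.5, 0.3, 0.3))
print(lockup_time_passive_s(10.0, 0.1, 0.5, 10.0, 1.0, 1.0))
```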


4.4 Jitter analysis


In the discussion that follows, the term “reference signal” is the
signal entering the phase detector; thus, if $N_{FF} = 1$, the reference
signal is the incoming signal itself. Also, the terms jitter and phase
noise are equivalent.

The output jitter of a PLL may be conveniently separated into two
main components: jitter generation and jitter propagation. Jitter
propagation refers to the output jitter due to jitter on the incoming signal
(or the reference signal for frequency synthesizer applications). Jitter
generation may be subdivided into five components: output jitter due
to leakage of the reference signal through the phase detector; output
jitter due to VCO phase noise; and output jitter due to phase noise of
the phase detector, frequency divider, and loop filter (if an active loop
filter is employed). PLACE analyzes each component of jitter
separately.

The most important result of this section may be summarized 
(without mathematics) as follows: A PLL acts as a high-pass filter to 
VCO phase noise, and acts as a low-pass filter to incoming reference 
signal jitter. 


4.4.1 Output jitter due to jitter on the incoming reference signal 

4.4.1.1 Jitter bandwidth. It may easily be shown that the PLL’s
response to jitter on the incoming reference signal is given by the
closed loop gain transfer function H(s), eq. (4). Merely repeat the
analysis of Section 4.1.1 with $\phi_{in}$ now representing an instantaneous
phase deviation of the incoming reference frequency. Equation (4) is
repeated here for convenience:

$$H(s) = \frac{\phi_{out}(s)}{\phi_{in}(s)} = \frac{G(s)}{1 + \beta(s)G(s)} = \frac{K_\phi K_{Vr}F(s)/s}{1 + K_\phi K_{Vr}F(s)/sN_{FB}} = \frac{K_\phi K_{Vr}F(s)}{s + K_\phi K_{Vr}F(s)/N_{FB}} = \frac{N_{FB}K}{(s/F(s)) + K}.$$

Thus, above the frequency for unity closed loop gain H(s), the input
signal jitter will be attenuated. Hence a PLL acts as a low-pass filter
to jitter on the incoming reference signal. Also, notice from eq. (4)
that within the PLL system bandwidth, a PLL multiplies the reference
noise by the feedback counter divisor $N_{FB}$. That is, the feedback
counter effectively adds $20 \log N_{FB}$ dB/Hz phase noise to the reference
signal.

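The jitter bandwidth is the frequency at which H(s) is down 3 dB
from its dc value. If we neglect any parasitic VCO or operational
amplifier poles, we have the following:

No loop filter:

$$F(s) = 1,$$

and

$$\text{jitter}_{BW} = \frac{1}{2\pi}[K] \text{ Hz}.$$

RC loop filter:

$$F(s) = \frac{1}{1 + \tau s},$$

and

$$\text{jitter}_{BW} = \frac{1}{2\pi}\,\omega_n\left\{(-2\zeta^2 + 1) + \sqrt{(2\zeta^2 - 1)^2 + 1}\right\}^{1/2} \text{ Hz}.$$

Lag-lead loop filter:

$$F(s) = \frac{1 + \tau_2 s}{1 + \tau_1 s},$$

and

$$\text{jitter}_{BW} = \frac{1}{2\pi}\,\omega_n\left\{a + \sqrt{a^2 + 1}\right\}^{1/2} \text{ Hz},$$

where

$$a = 2\zeta^2 + 1 - \frac{4\zeta\omega_n}{K} + \frac{\omega_n^2}{K^2}.$$

Second-order active loop filter:

$$F(s) = \frac{1 + \tau_2 s}{\tau_1 s},$$

and

$$\text{jitter}_{BW} = \frac{1}{2\pi}\,\omega_n\left\{2\zeta^2 + 1 + \sqrt{(2\zeta^2 + 1)^2 + 1}\right\}^{1/2} \text{ Hz}.$$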


When there are parasitic poles (which often reduce the bandwidth 
values calculated above), PLACE uses an iterative search to find the 
jitter bandwidth. 
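A closed-form bandwidth of this kind is easy to sanity-check by substituting it back into the closed loop gain magnitude. The sketch below does so for the second-order active loop filter, assuming the normalized response $|H|/N_{FB} = \{[1 + (2\zeta x)^2]/[(1 - x^2)^2 + (2\zeta x)^2]\}^{1/2}$ with $x = \omega/\omega_n$ and no parasitic poles; the function names are illustrative.

```python
import math

def closed_loop_mag_active(x, zeta):
    """|H(jw)|/N_FB for the second-order active loop, x = w/w_n."""
    num = 1.0 + (2.0 * zeta * x) ** 2
    den = (1.0 - x * x) ** 2 + (2.0 * zeta * x) ** 2
    return math.sqrt(num / den)

def jitter_bw_active_hz(f_n_hz, zeta):
    """Closed-form 3-dB jitter bandwidth in Hz (f_n = w_n / 2*pi)."""
    a = 2.0 * zeta**2 + 1.0
    return f_n_hz * math.sqrt(a + math.sqrt(a * a + 1.0))

f_n, zeta = 2.0, 0.707  # illustrative natural frequency (Hz) and damping
bw = jitter_bw_active_hz(f_n, zeta)
# The magnitude at the computed bandwidth should be 1/sqrt(2) of the dc value.
print(round(bw, 3), round(closed_loop_mag_active(bw / f_n, zeta), 4))
```

The same substitution test does not apply once parasitic poles are included, which is the case where PLACE falls back on an iterative search.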

4.4.1.2 Frequency and magnitude of jitter peaking. The peak jitter gain
$|H(j\omega_{peak})|$ is equivalent to the peak magnitude of H(s). Jitter peaking
is the difference (in decibels) between the closed loop gain peak
magnitude $|H(j\omega_{peak})|$ and the dc value of H(s). To find it we first
find the frequency of peak jitter gain $\omega_{peak}$ by setting the derivative of
$|H(j\omega)|^2$ to zero and solving for $\omega_{peak}$. Then we substitute $\omega_{peak}$ back
into $H(j\omega)$. Neglecting any parasitic VCO or operational amplifier
poles, we get the following results:

No loop filter:

$$F(s) = 1,$$

$$\omega_{peak} = 0,$$

and

$$|H(j\omega_{peak})| = 0 \text{ dB}.$$

RC loop filter:

$$F(s) = \frac{1}{1 + \tau s},$$

$$\omega_{peak} = \omega_n\sqrt{1 - 2\zeta^2},$$

and

$$|H(j\omega_{peak})| = \frac{1}{2\zeta\sqrt{1 - \zeta^2}}.$$

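Second-order active loop filter:

$$F(s) = \frac{1 + \tau_2 s}{\tau_1 s},$$

$$\omega_{peak} = \frac{\omega_n}{2\zeta}\left\{\sqrt{1 + 8\zeta^2} - 1\right\}^{1/2},$$

and

$$|H(j\omega_{peak})| = (4\zeta^2)\left\{\frac{\sqrt{1 + 8\zeta^2}}{2 + 16\zeta^2 + (16\zeta^4 - 8\zeta^2 - 2)\sqrt{1 + 8\zeta^2}}\right\}^{1/2}.$$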

For the lag-lead and third-order active loop filters, PLACE uses an 
iterative search to find the jitter peaking. The above equations do not 
reveal the jitter peaking due to parasitic poles. PLACE uses an iterative 
search to find the jitter peaking in such cases. 
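For the second-order active loop filter the peaking can be evaluated directly from the peak frequency. The sketch below assumes the normalized closed loop magnitude $\{[1 + (2\zeta x)^2]/[(1 - x^2)^2 + (2\zeta x)^2]\}^{1/2}$, $x = \omega/\omega_n$, and the peak location $x_{peak} = (1/2\zeta)\{\sqrt{1 + 8\zeta^2} - 1\}^{1/2}$; the function names are illustrative and no parasitic poles are modeled.

```python
import math

def closed_loop_mag_active(x, zeta):
    """|H(jw)|/N_FB for the second-order active loop, x = w/w_n."""
    num = 1.0 + (2.0 * zeta * x) ** 2
    den = (1.0 - x * x) ** 2 + (2.0 * zeta * x) ** 2
    return math.sqrt(num / den)

def peak_x(zeta):
    """Normalized frequency w_peak/w_n of maximum closed loop gain."""
    return math.sqrt(math.sqrt(1.0 + 8.0 * zeta**2) - 1.0) / (2.0 * zeta)

def jitter_peaking_db(zeta):
    """Jitter peaking relative to the dc value, in dB."""
    return 20.0 * math.log10(closed_loop_mag_active(peak_x(zeta), zeta))

zeta = math.sqrt(0.5)
print(round(peak_x(zeta), 4), round(jitter_peaking_db(zeta), 2))
```

Because the loop filter zero remains in the closed loop response, this configuration shows some peaking at every damping value; raising the damping reduces it but does not remove it.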

4.4.1.3 Noise bandwidth. The noise bandwidth of a PLL is the bandwidth
of an equivalent rectangular filter that would yield the same
output noise power (variance) as the PLL, if they both have white
noise inputs of equal density. PLACE calculates the one-sided noise
bandwidth, where $N_{BW} = \int_0^\infty |H(j\omega)|^2\,d\omega$. Various authors have
evaluated the integrals, and we have the following:

No loop filter:

$$F(s) = 1,$$

and

$$N_{BW} = \frac{K}{4} \text{ Hz}.$$

RC loop filter:

$$F(s) = \frac{1}{1 + \tau s},$$

and

$$N_{BW} = \frac{K}{4} = \frac{\omega_n}{8\zeta} \text{ Hz}.$$

Lag-lead loop filter:

$$F(s) = \frac{1 + \tau_2 s}{1 + \tau_1 s},$$

and 


Second-order active loop filter: 
$$F(s) = \frac{1 + \tau_2 s}{\tau_1 s},$$
and 


For the third-order active loop filter, the noise bandwidth is approxi- 
mately given by the PLL system bandwidth defined earlier. 

The output noise of the PLL, given a white noise input with power
$P_{in}$, is given by $P_{out} = P_{in}N_{BW}$. Similarly, the output signal-to-noise
ratio of the PLL is inversely proportional to the noise bandwidth.


4.4.2 Output jitter due to leakage of the reference through the phase 
detector 


4.4.2.1 Frequency and magnitude of first sideband. Any feedthrough of
the reference frequency through the loop filter phase modulates the
VCO, producing sidebands at the VCO output. The frequency of the
first sideband is the frequency leaving the phase detector: twice the
reference frequency for exclusive-or gate phase detectors, the reference
frequency for other phase detectors. The magnitude of the first
sideband is found as follows: the modulating frequency is $f_m = f_{ref}$, and the
Modulation Index (MI) is $MI = \Delta f/f_m = \Delta f/f_{ref}$. Using conventional
frequency modulation/phase modulation analysis, we can express the
resulting VCO output voltage u(t) by using Bessel functions as

$$u(t) = V_0\{J_0(MI)\cos(\omega_0 t) + J_1(MI)\cos(\omega_0 + \omega_m)t\},$$

where we have neglected the higher-order sidebands. Now, if the
modulation index is small (MI < 0.3), we have $J_1(MI)/J_0(MI) \approx MI/2$.
For example, $J_1(0.2)/J_0(0.2) = 0.0995/0.9900 \approx 0.1 = MI/2$.
Hence,

$$\frac{\text{First sideband amplitude}}{\text{VCO carrier amplitude}} = \frac{J_1(MI)}{J_0(MI)} \approx \frac{MI}{2} = \frac{\Delta f}{2f_m} = \frac{\Delta f}{2f_{ref}}.$$

When we realize that most of the signal power is in the carrier for
small modulation indices, we see that the above equation is the
definition of the $\mathcal{L}(f)$ (see Refs. 12 and 13) used in specifying oscillator
phase noise.

Now, $\Delta f$ is the frequency deviation of the VCO:

$$\Delta f = \Delta V_F K_V = V_{pk}|F(j\omega)|K_V,$$

where $V_{pk}$ is the peak amplitude of $f_{ref}$ at the phase detector output.
(For exclusive-or gate phase detectors, the phase detector output
frequency is twice $f_{ref}$.)

Thus, the magnitude of the first sideband (relative to the carrier) at
the PLL output is given by

$$\frac{V_{pk}K_V|F(j\omega)|}{2f_{ref}}, \qquad (7)$$

or, in decibels,

$$20\log\left[\frac{V_{pk}K_V|F(j\omega)|}{2f_{ref}}\right], \qquad (8)$$

where $F(j\omega)$ is evaluated at $f_{ref}$. (This result agrees with Ref. 14.)
PLACE asks the user for the value of $V_{pk}$, the peak voltage output
from the phase detector.

4.4.2.2 Peak frequency deviation and phase jitter. It can be shown that
the peak phase deviation is related to $\mathcal{L}(f)$ by the following relation:

$$\mathcal{L}(f) = \frac{\Delta\phi_{peak}}{2}.$$

The magnitude of $\mathcal{L}(f)$ is given by eq. (7); PLACE 2.0 solves the
above equation for the peak phase deviation in radians. Then, the
peak phase jitter in degrees is 57.3 times as large. The peak frequency
deviation is found by using $\Delta f_{peak} = f_m\Delta\phi_{peak}$.

As long as the peak phase jitter is less than approximately 28
degrees, the PLL will generally stay in lock (see Ref. 16).
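The conversion from a first-sideband level to peak phase jitter can be sketched as below. The sketch assumes a small modulation index, so that the sideband-to-carrier amplitude ratio is approximately MI/2 and MI equals the peak phase deviation in radians; the function names are hypothetical and this is not the PLACE source.

```python
import math

def peak_phase_deviation_rad(sideband_dbc):
    """Peak phase deviation (rad) from a first-sideband level in dBc.

    Assumes amplitude ratio = (peak phase deviation)/2, valid for small MI.
    """
    amplitude_ratio = 10.0 ** (sideband_dbc / 20.0)
    return 2.0 * amplitude_ratio

def peak_phase_jitter_deg(sideband_dbc):
    """Peak phase jitter in degrees (about 57.3 times the value in radians)."""
    return math.degrees(peak_phase_deviation_rad(sideband_dbc))

def peak_freq_deviation_hz(f_m_hz, sideband_dbc):
    """Peak frequency deviation, f_m times the peak phase deviation."""
    return f_m_hz * peak_phase_deviation_rad(sideband_dbc)

print(round(peak_phase_jitter_deg(-52.0), 3))
```

A sideband at -52 dBc corresponds to roughly 0.29 degree of peak phase jitter under this approximation, far below the 28-degree level quoted above for loss of lock.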

4.4.2.3 Loop filter’s attenuation of reference frequency. PLACE calcu- 
lates the magnitude of F(s) at the reference frequency and informs the 
user of the result. 


4.4.3 Output jitter due to PLL components 
4.4.3.1 Output jitter due to VCO phase noise. PLACE calculates the
PLL’s relative response to VCO phase noise (i.e., it calculates the
PLL’s attenuation of VCO phase noise); PLACE does not calculate
the absolute value of the VCO-generated phase noise itself. (References
9, 12, and 17 explain how to measure the absolute VCO phase noise.)
A VCO may be modeled as a pure signal source corrupted by an
instantaneous phase fluctuation $\phi_{VCO}(t)$. We modify the PLL model
by adding this noise source to the VCO of Fig. 1. The total PLL output
phase is now $\phi_{out}(t) = \phi_{loop}(t) + \phi_{VCO}(t)$, where $\phi_{loop}(t)$ is the output
phase for a PLL with an ideal VCO. We now have the following:

$$\phi_e(s) = \phi_{in}(s) - \frac{\phi_{out}(s)}{N_{FB}},$$

but

$$\phi_{out}(s) = \phi_e(s)F(s)K_\phi K_{Vr}/s + \phi_{VCO}(s),$$

so that

$$\phi_e(s) = \frac{\phi_{out}(s) - \phi_{VCO}(s)}{F(s)K_\phi K_{Vr}/s}.$$

Algebraic manipulations yield

$$\phi_{out} = \phi_{in}\,\frac{N_{FB}KF(s)/s}{1 + KF(s)/s} + \phi_{VCO}\,\frac{1}{1 + KF(s)/s},$$

$$\phi_{out} = \phi_{in}\,\frac{G(s)}{1 + \beta(s)G(s)} + \phi_{VCO}\,\frac{1}{1 + \beta(s)G(s)}. \qquad (9)$$

Thus, the PLL response to VCO phase noise is given by

$$\frac{1}{1 + \beta(s)G(s)}. \qquad (10)$$

(This result is confirmed in Refs. 5, 8, and 13.) The above equation
tells us that the -3 dB point of the PLL response to VCO phase noise
occurs when $|\beta(s)G(s)| = 1$; but this occurs at the PLL system
bandwidth (defined in Section 4.1.6). Hence, the PLL will pass through
the VCO phase noise unattenuated above the PLL system bandwidth.
Below the PLL system bandwidth, the output jitter due to VCO phase
noise is reduced by the loop gain $\beta(s)G(s)$.

In short, a PLL acts as a high-pass filter to VCO phase noise. 
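The high-pass behavior can be seen by evaluating $|1/(1 + \beta(s)G(s))|$ at a few frequencies. The sketch below does this for a lag-lead loop filter, taking $\beta(s)G(s) = K(1 + \tau_2 s)/[s(1 + \tau_1 s)]$; the function name and the value of K are assumptions made for illustration, and parasitic poles are ignored.

```python
import math

def vco_noise_response(f_hz, k, tau1, tau2):
    """|1/(1 + beta*G)| at frequency f_hz for a lag-lead loop filter."""
    s = complex(0.0, 2.0 * math.pi * f_hz)
    loop_gain = k * (1.0 + s * tau2) / (s * (1.0 + s * tau1))
    return abs(1.0 / (1.0 + loop_gain))

# Illustrative loop: K in rad/s is assumed; tau1 and tau2 are in seconds.
K, TAU1, TAU2 = 9.0, 0.0575, 0.0040
for f in (0.01, 0.1, 1.0, 10.0, 1000.0):
    print(f, round(vco_noise_response(f, K, TAU1, TAU2), 4))
```

Well below the loop bandwidth the response is roughly the reciprocal of the loop gain, so VCO noise is strongly suppressed; well above it the response approaches unity and the VCO noise passes to the output unchanged.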

4.4.3.2 Output jitter due to frequency divider phase noise. In Section
4.4.1.1 it was shown that the feedback counter effectively adds
$20 \log N_{FB}$ dB/Hz phase noise to the reference signal; this section discusses
the jitter generated in the frequency divider itself.

Phase noise generated in digital frequency dividers is due to the
internal active devices. Typical noise floors for Transistor-Transistor
Logic (TTL), Emitter-Coupled Logic (ECL), and Metal Oxide
Semiconductor (MOS) dividers are -120 to -140 dB. TTL dividers have
the lowest noise, followed by ECL, and then MOS. (Typical values
may be found in Ref. 9.)

As in the case for VCO phase noise, PLACE does not need to know 
the absolute value of the divider noise. The PLL response to this noise 
is the same as for jitter on the incoming signal, because the PLL 
cannot tell whether the noise comes from the divider or signal inputs 
to the phase detector. Thus, the PLL response to divider noise is 
identical to the low-pass filter shape computed for incoming line jitter. 
[The PLL response to noise from an active loop filter is also given by 
eq. (4).] 

In timing recovery circuits, the divider noise is negligible compared 
with the input signal jitter. In frequency synthesizers, however, the 
two may be equal, and then the resultant noise equals their rms sum. 

4.4.3.3 Output jitter due to phase detector phase noise. Typical noise
floors for phase detectors are -150 to -160 dB/Hz (see Ref. 15). The
PLL response to this phase noise is again given by eq. (4). The phase
detector noise floor improves 3 dB per octave of reference frequency
reduction.


4.5 Sensitivity analysis 


PLACE asks the user for the tolerance of the VCO and phase 
detector gain constants, and the tolerance on the loop filter resistors 
and capacitors. It then performs a Taylor series expansion and retains 
the first-order terms. Then it calculates upper and lower bounds for 
the hold range, capture and pull-in ranges, and lockup time. This 
allows the designer to observe the effect of component variation on 
the PLL. 


4.6 Optimization routine 


PLACE 2.0 features an interactive routine that allows the designer 
to optimize PLL performance. The user specifies either a larger hold 
range, larger capture and pull-in range, faster lockup time, or less 
output jitter. After the user specifies a desired item to optimize, 
PLACE automatically adjusts the PLL damping and natural frequency 
to achieve the desired goal. 


V. CONCLUSION 


An interactive program to aid in the development of PLLs has been
written. PLACE has been successfully used in the design of PLLs for
digital channel banks, muldems, and radio transmission systems. In
addition, routine use of PLACE on several existing PLL designs
revealed problems that were subsequently corrected before the product
was placed in the field. The program has saved many PLL designers
from tedious calculations, allowing a better understanding of PLL
performance.

APPENDIX A

List of Symbols

E(s)                      Error transfer function
F(s)                      Loop filter transfer function
$F_{BW}$                  Loop filter 3-dB frequency
$f_m$                     Frequency of modulation
$f_{ref}$                 Reference frequency
$f_u$                     Frequency for unity open loop gain
$f_v$                     Frequency of VCO pole
$f_{VCO}$                 Frequency of VCO
$f_{capture}$             Capture range
$f_{hold}$                Hold range
$f_{pull}$                Pull-in range
G(s)                      Feed-forward gain
H(s)                      Closed loop gain
$|H(j\omega_{peak})|$     Magnitude of jitter peaking
K                         Lumped gain constant, $K = (K_\phi K_{Vr})/N_{FB}$
$K_\phi$                  Phase detector gain constant in V/rad
$K_V$                     VCO gain constant in Hz/V
$K_{Vr}$                  VCO gain constant in rad/s/V
$N_{BW}$                  Noise bandwidth in hertz
$N_{FB}$                  Feedback counter divisor
$N_{FF}$                  Feed-forward counter divisor
$PLL_{BW}$                PLL bandwidth
$T_{lock}$                Lockup time
$v_F$                     Loop filter output voltage
$V_F(s)$                  Loop filter output voltage
$\beta(s)$                Feedback gain
$\Delta f$                Frequency deviation
$\Delta f_{peak}$         Peak frequency deviation
$\Delta\phi_{peak}$       Peak phase deviation
$\zeta$                   Damping
$\phi_{in}(s)$            Input phase
$\phi_{out}(s)$           Output phase
$\phi_e(s)$               Phase error
$\omega_n$                Natural frequency
$\omega_{peak}$           Frequency of jitter peaking


APPENDIX B

PLACE 2.0 Output for Lag-Lead Loop Filter Example
Stability Analysis:

Phase Margin   PLL Bandwidth            Damping   2nd-Order Undamped
               (Open Loop Gain = 1)               Natural Frequency
(degrees)      (Hz)                               (Hz)
59.3           1.2                      0.7       2.0

Your VCO pole at 10.0 Hz reduced the phase margin by 7.41 degrees, and reduced the
PLL bandwidth by 0.1 Hz.

Your feedback counter reduces the phase margin by 0.02 degree.

Loop Filter Analysis:

Assuming 0.1 uf capacitor, our loop filter components are:

Loop Filters
T1=(R1+R2)C   T2=R2(C)     R1(ohms)   R2(ohms)   C(uf)      3-dB Freq. (Hz)
0.05745130    0.00400336   534479     40034      0.100000   2.8

Tracking Analysis:

Hold Range   Capture Range   Pull-In Range   Lockup Time
(+/- Hz)     (+/- Hz)        (+/- Hz)        (sec)
568.0        39.6            568.0           0.32447433

Jitter Analysis:

1. Output Jitter Due to Incoming Reference Signal Jitter:

Frequency of     Magnitude of     Noise       Jitter
Jitter Peaking   Jitter Peaking   Bandwidth   Bandwidth
(Hz)             (dB)             (Hz)        (Hz)
1.6              1.272            22.5        3

2. Output Jitter Due to Reference Leakage Through the Phase Detector:

Frequency of   Magnitude of   Peak Phase   Peak Frequency   Ref. Freq.
1st Sideband   1st Sideband   Jitter       Deviation        Attenuation
(Hz)           (dBc)          (degrees)    (Hz)             (dB)
8000           -52            0.278        19.4             23

3. Output Jitter Due to VCO Phase Noise:

Your PLL attenuates VCO phase noise BELOW 1.2 Hz.

Sensitivity Analysis:

               Hold Range   Capture Range   Pull-In Range   Lockup Time
               (+/- Hz)     (+/- Hz)        (+/- Hz)        (sec)
Lower Bound:   539.0        36.1            539.0           0.30792615
Nominal:       568.0        39.6            568.0           0.32447433
Upper Bound:   597.0        43.1            597.0           0.34102252
APPENDIX C 


Loop Gain Derivation 
C.1 Case 1. No loop filter: F(s) = 1

From eq. (4), the closed loop gain is

$$H(s) = \frac{N_{FB}K}{s/F(s) + K} = \frac{N_{FB}}{1 + s/K},$$

the closed loop gain magnitude is

$$|H(j\omega)| = N_{FB}\left\{\frac{1}{1 + \left(\frac{\omega}{K}\right)^2}\right\}^{1/2},$$

and the closed loop gain phase is

$$\phi(j\omega) = -\arctan(\omega/K).$$

From eq. (3), the open loop gain is

$$\beta(s)G(s) = K/s,$$

the open loop gain magnitude is

$$|\beta(j\omega)G(j\omega)| = K/\omega,$$

and the open loop gain phase is

$$\theta(j\omega) = -90 \text{ degrees (due to the VCO)}.$$

Note that the phase margin is 90 degrees and the loop is theoretically
unconditionally stable; but, if we account for any parasitic VCO pole
(or any other parasitic poles lumped into the VCO pole), PLACE
shows that the PLL can be unstable for very high loop gain K. No
loop filter results in a first-order type-1 PLL.


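C.2 Case 2. RC loop filter: $F(s) = 1/(1 + \tau s)$

From eq. (4), the closed loop gain is

$$H(s) = \frac{N_{FB}K}{s/F(s) + K} = \frac{N_{FB}K}{s^2\tau + s + K} = \frac{N_{FB}K/\tau}{s^2 + s/\tau + K/\tau}.$$

We can get the denominator into the classical second-order control
system form of $s^2 + 2\zeta\omega_n s + \omega_n^2$ if we let $\omega_n^2 = K/\tau$ and $2\zeta\omega_n = 1/\tau$.
Then,

$$H(s) = \frac{N_{FB}\,\omega_n^2}{s^2 + 2\zeta\omega_n s + \omega_n^2} = \frac{N_{FB}}{\left(\frac{s}{\omega_n}\right)^2 + 2\zeta\frac{s}{\omega_n} + 1},$$

where the undamped natural frequency is

$$\omega_n = \sqrt{K/\tau},$$

and the damping is

$$\zeta = \frac{1}{2\omega_n\tau} = \frac{1}{2\sqrt{K\tau}}.$$

Thus, the closed loop gain magnitude is

$$|H(j\omega)| = N_{FB}\left\{\frac{1}{\left[1 - \left(\frac{\omega}{\omega_n}\right)^2\right]^2 + \left[2\zeta\frac{\omega}{\omega_n}\right]^2}\right\}^{1/2},$$

and the closed loop gain phase is

$$\phi(j\omega) = -\arctan\frac{2\zeta\frac{\omega}{\omega_n}}{1 - \left(\frac{\omega}{\omega_n}\right)^2}.$$

From eq. (3), the open loop gain is

$$\beta(s)G(s) = \frac{K}{s(1 + \tau s)},$$

the open loop gain magnitude is

$$|\beta(j\omega)G(j\omega)| = \frac{K}{\omega\sqrt{1 + (\omega\tau)^2}},$$

and the open loop gain phase is $\theta(j\omega) = -90 - \arctan\omega\tau$.

Note that the phase starts off at -90 degrees and approaches -180
degrees at high frequencies. The larger the value of $\tau$, the lower the
damping and the smaller the phase margin. The RC loop filter results
in a second-order type-1 PLL.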


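C.3 Case 3. Lag-lead loop filter: $F(s) = (1 + \tau_2 s)/(1 + \tau_1 s)$

From eq. (4), the closed loop gain is

$$H(s) = \frac{N_{FB}K}{s/F(s) + K} = \frac{N_{FB}K(1 + \tau_2 s)}{s(1 + \tau_1 s) + K(1 + \tau_2 s)} = \frac{N_{FB}K(1 + \tau_2 s)/\tau_1}{s^2 + s\,\frac{1 + K\tau_2}{\tau_1} + \frac{K}{\tau_1}}.$$

We can get the denominator into the classical second-order control
system form of $s^2 + 2\zeta\omega_n s + \omega_n^2$ if we let $\omega_n^2 = K/\tau_1$ and
$2\zeta\omega_n = (1 + K\tau_2)/\tau_1$. Then,

$$H(s) = \frac{N_{FB}\,\omega_n^2\left[1 + s(2\zeta/\omega_n - 1/K)\right]}{s^2 + 2\zeta\omega_n s + \omega_n^2} = \frac{N_{FB}\left[1 + s(2\zeta/\omega_n - 1/K)\right]}{\left(\frac{s}{\omega_n}\right)^2 + 2\zeta\frac{s}{\omega_n} + 1},$$

where the undamped natural frequency is

$$\omega_n = \sqrt{K/\tau_1},$$

and the damping is

$$\zeta = \frac{1}{2\omega_n}\left(\frac{1}{\tau_1} + \frac{K\tau_2}{\tau_1}\right) = \frac{\omega_n}{2}\left(\tau_2 + 1/K\right).$$

Thus, the closed loop gain magnitude is

$$|H(j\omega)| = N_{FB}\left\{\frac{1 + (2\zeta\omega/\omega_n - \omega/K)^2}{\left[1 - \left(\frac{\omega}{\omega_n}\right)^2\right]^2 + \left[2\zeta\frac{\omega}{\omega_n}\right]^2}\right\}^{1/2},$$

and the closed loop gain phase is

$$\phi(j\omega) = \arctan\left(2\zeta\frac{\omega}{\omega_n} - \frac{\omega}{K}\right) - \arctan\frac{2\zeta\frac{\omega}{\omega_n}}{1 - \left(\frac{\omega}{\omega_n}\right)^2}.$$

From eq. (3), the open loop gain is

$$\beta(s)G(s) = \frac{K(1 + \tau_2 s)}{s(1 + \tau_1 s)},$$

the open loop gain magnitude is

$$|\beta(j\omega)G(j\omega)| = \frac{K}{\omega}\left\{\frac{1 + (\omega\tau_2)^2}{1 + (\omega\tau_1)^2}\right\}^{1/2},$$

and the open loop gain phase is

$$\theta(j\omega) = -90 + \arctan(\omega\tau_2) - \arctan(\omega\tau_1).$$

Note that the phase starts off at -90 degrees, approaches -135
degrees midway between $1/\tau_1$ and $1/\tau_2$, and then approaches -90
degrees again for high frequencies.

The lag-lead loop filter results in a second-order type-1 PLL.
(For large gain K, eq. (7) reduces to eq. (8), i.e., the lag-lead loop filter
is an approximation of the second-order active loop filter.)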


C.4 Case 4. Second-order active loop filter: $F(s) = (1 + \tau_2 s)/(\tau_1 s)$

If it is desired to account for the operational amplifier pole, the
designer may lump this pole into the VCO parasitic pole. Try to keep
the operational amplifier bandwidth much larger than the PLL
bandwidth (defined in Section 4.2).

From eq. (4), the closed loop gain is

$$H(s) = \frac{N_{FB}K}{s/F(s) + K} = \frac{N_{FB}K(1 + \tau_2 s)}{s^2\tau_1 + K(1 + \tau_2 s)} = \frac{N_{FB}K(1 + \tau_2 s)/\tau_1}{s^2 + sK\tau_2/\tau_1 + K/\tau_1}.$$

We can get the denominator into the classical second-order control
system form of $s^2 + 2\zeta\omega_n s + \omega_n^2$ if we let $\omega_n^2 = K/\tau_1$ and
$2\zeta\omega_n = K\tau_2/\tau_1$. Then,

$$H(s) = \frac{N_{FB}\,\omega_n^2\left[1 + s(2\zeta/\omega_n)\right]}{s^2 + 2\zeta\omega_n s + \omega_n^2} = \frac{N_{FB}\left[1 + s(2\zeta/\omega_n)\right]}{\left(\frac{s}{\omega_n}\right)^2 + 2\zeta\frac{s}{\omega_n} + 1},$$

where the undamped natural frequency is

$$\omega_n = \sqrt{K/\tau_1},$$

and the damping is

$$\zeta = \frac{1}{2\omega_n}\,\frac{K\tau_2}{\tau_1} = \frac{\omega_n\tau_2}{2}.$$

Thus, the closed loop gain magnitude is

$$|H(j\omega)| = N_{FB}\left\{\frac{1 + 4\zeta^2\left(\frac{\omega}{\omega_n}\right)^2}{\left[1 - \left(\frac{\omega}{\omega_n}\right)^2\right]^2 + \left[2\zeta\frac{\omega}{\omega_n}\right]^2}\right\}^{1/2},$$

and the closed loop gain phase is

$$\phi(j\omega) = \arctan[2\zeta(\omega/\omega_n)] - \arctan\frac{2\zeta\frac{\omega}{\omega_n}}{1 - \left(\frac{\omega}{\omega_n}\right)^2}.$$

From eq. (3), the open loop gain is

$$\beta(s)G(s) = \frac{K(1 + \tau_2 s)}{\tau_1 s^2},$$

the open loop gain magnitude is

$$|\beta(j\omega)G(j\omega)| = \frac{K}{\tau_1\omega^2}\sqrt{1 + (\omega\tau_2)^2},$$

and the open loop gain phase is

$$\theta(j\omega) = -180 + \arctan(\omega\tau_2).$$

Note that the phase starts off at -180 degrees and approaches -90
degrees at high frequencies. Thus, the second-order PLL exhibits a
large phase margin, even with parasitic (VCO) poles. Ensure $\tau_1 < \tau_2$
for good stability.

This loop filter results in a second-order type-2 PLL.

C.5 Case 5. Third-order active loop filter: $F(s) = (1 + \tau_2 s)/[\tau_1 s(1 + \tau_3 s)]$

From eq. (4), the closed loop gain is

$$H(s) = \frac{N_{FB}K}{s/F(s) + K} = \frac{N_{FB}K(1 + \tau_2 s)}{s^2\tau_1(1 + \tau_3 s) + K(1 + \tau_2 s)} = \frac{N_{FB}K(1 + \tau_2 s)}{s^3\tau_1\tau_3 + s^2\tau_1 + sK\tau_2 + K}.$$

Thus, the closed loop gain magnitude is

$$|H(j\omega)| = N_{FB}K\left\{\frac{1 + (\omega\tau_2)^2}{[K - \omega^2\tau_1]^2 + [\omega K\tau_2 - \omega^3\tau_1\tau_3]^2}\right\}^{1/2},$$

and the closed loop gain phase is

$$\phi(j\omega) = \arctan(\omega\tau_2) - \arctan\frac{(\omega K\tau_2 - \omega^3\tau_1\tau_3)}{(K - \omega^2\tau_1)}.$$

From eq. (3), the open loop gain is

$$\beta(s)G(s) = \frac{K(1 + \tau_2 s)}{\tau_1 s^2(1 + \tau_3 s)},$$

the open loop gain magnitude is

$$|\beta(j\omega)G(j\omega)| = \frac{K}{\tau_1\omega^2}\left\{\frac{1 + (\omega\tau_2)^2}{1 + (\omega\tau_3)^2}\right\}^{1/2},$$

and the open loop gain phase is

$$\theta(j\omega) = -180 + \arctan(\omega\tau_2) - \arctan(\omega\tau_3).$$

Note that the phase starts off at -180 degrees, approaches -135
degrees midway between $\tau_2$ and $\tau_3$, and then approaches -180 degrees
again for high frequencies.

For the third-order active filter case, it is possible to define an
equivalent damping and natural frequency if we let $1/\tau_3$ be much
higher than $1/\tau_2$. Rather than making this approximation, PLACE
solves directly for the phase margin using an algorithm (see Ref. 18)
that optimizes phase noise performance. The point of minimum phase
shift (i.e., the inflection point of the open loop gain phase) is placed
at exactly the frequency $\omega_u$ for unity open loop gain. (The user can
specify $\omega_u$, or else PLACE defaults the value to $f_u = f_{ref}/50$.) This will
occur when the loop filter time constants obey the following relations:

$$\tau_3 = \frac{\sec(pm) - \tan(pm)}{\omega_u},$$

$$\tau_2 = \frac{1}{\omega_u^2\,\tau_3},$$

$$\tau_1 = \frac{K}{\omega_u^2}\left\{\frac{1 + (\omega_u\tau_2)^2}{1 + (\omega_u\tau_3)^2}\right\}^{1/2}.$$

PLACE asks the user for the desired phase margin and computes the
above values for $\tau_1$, $\tau_2$, and $\tau_3$. Later the user has the opportunity to
enter his own values for these variables.
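The time-constant relations can be exercised with a short script. The sketch below assumes $\tau_3 = [\sec(pm) - \tan(pm)]/\omega_u$, $\tau_2 = 1/(\omega_u^2\tau_3)$, and $\tau_1$ chosen so that the open loop gain magnitude is unity at $\omega_u$; it then confirms that the resulting loop has the requested phase margin. Function names and the numerical inputs are hypothetical, and parasitic poles are ignored.

```python
import math

def third_order_time_constants(k, omega_u, phase_margin_deg):
    """Return (tau1, tau2, tau3) for a desired phase margin at omega_u (rad/s)."""
    pm = math.radians(phase_margin_deg)
    tau3 = (1.0 / math.cos(pm) - math.tan(pm)) / omega_u
    tau2 = 1.0 / (omega_u**2 * tau3)
    ratio = (1.0 + (omega_u * tau2) ** 2) / (1.0 + (omega_u * tau3) ** 2)
    tau1 = (k / omega_u**2) * math.sqrt(ratio)
    return tau1, tau2, tau3

def open_loop_gain(omega, k, tau1, tau2, tau3):
    """beta*G at s = j*omega for the third-order active loop filter."""
    s = complex(0.0, omega)
    return k * (1.0 + tau2 * s) / (tau1 * s * s * (1.0 + tau3 * s))

def phase_margin_deg(omega_u, tau2, tau3):
    """Phase margin: 180 degrees plus the open loop phase at omega_u."""
    return math.degrees(math.atan(omega_u * tau2) - math.atan(omega_u * tau3))

K = 1000.0                      # assumed lumped gain, rad/s
w_u = 2.0 * math.pi * 100.0     # assumed unity-gain frequency, rad/s
t1, t2, t3 = third_order_time_constants(K, w_u, 45.0)
print(round(abs(open_loop_gain(w_u, K, t1, t2, t3)), 6),
      round(phase_margin_deg(w_u, t2, t3), 3))
```

Because $\omega_u = 1/\sqrt{\tau_2\tau_3}$ under these relations, the unity-gain frequency sits at the geometric mean of the zero and pole frequencies, which is where the phase lead of the filter is greatest.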
ERRATA


In the March 1985 issue (Vol. 64, No. 3) on Assuring High Reliability 
of Lasers and Photodetectors for Submarine Lightwave Cable Systems, 
the list of editorial staff on the inside front cover should have included 
the following: “R. L. Hartman, Coordinating Editor of the submarine 
lightwave cable reliability issue.” 


In the paper by Z. L. Budrikis and M. Hatamian, “Moment Calculations
by Digital Filters,” AT&T Bell Laboratories Technical Journal,
Vol. 63, No. 2 (February 1984), the following corrections are noted:


Page 219: equation (3): The upper limit on the 
summation terms should be “n” instead of “N”. 


Page 220, Table I, third column, first row: 


zZ 

oe 7 Should be -— i: 

Page 220, Table I, fifth column, first row: 
u(n) should be u(n-1). 


Page 220, the equation in the middle 
of the page between eq. (5) and eq. (6): 
u(n) should be u(n-1). 








In the paper by M. Hatamian and E. G. Bowen, “Homenet: A Broad- 
band Voice/Data/Video Network on CATV,” AT&T Technical Jour- 
nal, Vol. 64, No. 2, Part 1 (February 1985), the following corrections 
are noted: 


Page 348, second paragraph, second line: 
Ref. 1 should be Ref. 2. 